In the OP screen share, they toggle various telemetry options on and off, but every time a setting changes, there is a pop-up that says "a setting has changed that requires a restart [of the editor] to take effect" -- and the user just hits "cancel" and doesn't restart the editor. Then, unsurprisingly, the observed behavior doesn't change. Maybe I'm dumb and/or restarting the editor doesn't actually make a difference, but at least superficially, I'm not sure you can draw useful conclusions from this kind of testing...
edit: to be clear I see that they X-out the topmost window of the editor and then re-launch from the bottom bar, but it's not obvious that this is actually restarting the stuff that matters
Thanks for watching and catching that. It seems like a major oversight for the core claim that disabling telemetry doesn't work: if a restart is required and the tests ignored the restart warning, that would invalidate them.
Either way, it’s useful to see the telemetry payloads.
It was rough a few years ago, but nowadays it's pretty nice. TI rebuilt their Code Composer Studio using Theia so it does have some larger users. It has LSP support and the same Monaco editor backend - which is all I need.
It's VSCode-with-an-Eclipse-feel to it - which might or might not be your cup of tea, but it's an alternative.
Agreed, it's not the most well-thought-out landing page, but the explore page gives a good insight into how it's being used and what it looks like: https://theia-ide.org/theia-platform/
(Scroll down to Selected Tools based on Eclipse Theia)
The feature that keeps me from moving off of VSCode is their markdown support, in particular the ability to drag and drop to insert links to files and images. Surprisingly, no other editor does this, even though I use it all the time.
I took the plunge and don't regret it. Despite the condition of the web site the extension is very useful and relatively free of any annoying bugs.
At some point in time, I'd like to take the time to invest building a custom version of the extension to bump dependencies to access more modern support for plantuml/mermaid diagrams.
I belong to the class of people who believe in customising their tools as they please. So I'd have written an Emacs package to do this. But then again, this is Emacs, so someone's probably already done it. Oh, here it is: https://github.com/mooreryan/markdown-dnd-images
Yeah, instead of forking VSCode, which is not modification-friendly, they should just use Theia, because it is maintained to be modular and can be used like a library to build IDEs of your choice.
Eclipse (as in ecosystem) is fairly popular in Enterprise, but since it exposes all the knobs, and is a bona fide IDE which has some learning curve, people stay away from it.
Also, it used to be kinda heavy, but it became lighter thanks to Moore's law and good code-management practices across the board.
I'm planning to deploy Theia in its web-based form if possible, but I still haven't had the time to tinker with it.
Using Eclipse as "the Java LSP" in VSCode makes more sense now.
Nevertheless, as much as I respect Erich for what he did for Eclipse, I won't be able to follow him to VSCode, since I don't respect Microsoft as much.
So you're also not using GitHub, LinkedIn, TypeScript (or any FE framework that uses it), games from any Microsoft-owned studio, making no Linux kernel contributions, no GHC contributions, ...?
It is kind of hard to avoid nowadays.
Here's a session with him on the history of VSCode:
"The Story of Visual Studio Code with Erich Gamma and Kai Maetzel"
This is why I used "(as in ecosystem)" in the first paragraph. It was a bit late when I wrote this comment, and it turned out to be very blurry meaning wise.
What's wrong with that? If they re-implement the whole thing it would amount to the same code size. It's the JDT language SERVER not some sort of "headless" software with UI needlessly bundled.
Java isn't quite what I think of as lightweight. I mean it probably can be, but most Java engineering I know of is all about adding more and more libraries, frameworks, checks, tests, etc.
I would be interested to see a similar analysis of ByteDance's video editor, CapCut (desktop version). The editor itself is amazing, IMO it has the best UI of any video editing software I've used. Surely, it's full of telemetry and/or spyware, though, but it would be good to know to which extent. I couldn't find any such analysis.
Great analysis, well done!
Since you've already done VSCode, Trae, and Cursor, can you analyse Kiro (the AWS fork)? I'm curious about their data collection practices.
Anecdata but Kiro is much, much, much, much easier to put through corporate procurement compared to its peers. I'm talking days vs months.
This is not because it is better, and I've seen no indication that it would somehow be more private or secure; it's that most enterprises already share their proprietary data with AWS and have an agreement with AWS that their TAMs will gladly usher Kiro usage under.
Interesting to distinguish that privacy/security as it relates to individuals is taken at face value, while when it relates to corporations it is taken at disclosure value.
This seems perfectly rational. If you're already entrusting all your technical infrastructure to AWS, then adding another AWS service doesn't add any additional supply-chain risk, whereas adding something from another vendor does do that.
I don't want any program on my computer including the OS to make any network calls whatsoever unless they're directly associated with executing GUI/CLI interactions I have currently undertaken as the user. Any exception should be opt-in. IMHO the entire Overton window of these remote communications is in the wrong place.
But the telemetry settings not working and the actions of the Trae moderators to quash any discussion of the telemetry is extremely concerning. People should be able to make informed decisions about the programs they are using, and the Trae developers don't seem to agree.
To further the analogy, sex may be an industry, but not everyone who participates does so commercially. Some who do so commercially may not want to be filmed.
we don't have to accept it. but people say "it's just how we do it" and suddenly people just accept it.
i really feel like our society is going to collapse soon, if it hasn't already begun to. the amount of total crap that people are put through just so that ads can be more targeted to users. we are creating a hellscape for privacy and freedom just so people click on ads. it is pure and complete insanity, and no one cares.
I was interested in learning Dart until the installer told me Google would be collecting telemetry. For a programming language. I’ve never looked at it again.
As a somewhat paranoid person, I find this level of paranoia beyond me. Like, do you own a car? Or a phone? A credit card? Walk around in public where there are cameras on every block? I don't agree with it at all, but the world we're living in makes it impossible not to be tracked by way more than (usually anonymized) telemetry data.
It's not nihilism. I still ad block, use an RFID wallet, and don't install any apps on my phone, I rarely use google for anything. But at some point when something is so ridiculously useful and the data they're getting doesn't really mean much of anything I have to stop caring. I use Windows 11 (gross) because it lets me play video games with my friends I can't otherwise. I use Uber because it lets me get across town. I use Visual Studio because it helps me code. I use Chatgpt because it helps me with so many things. To take away any one of those because I'm a privacy absolutist seems silly to me. It has the exact same vibe of never leaving your room because you're afraid of all the cameras.
I'd like there to be a push back against these companies because I find their practices disgusting but running linux with only open source software and a fairphone is just an extreme I'm unwilling to entertain because it's just not possible in my (or most people's) world.
In fairness, as others have pointed out, the phone is much more personal than the home computer. Your phone is almost always with you, collecting much more intimate data than your PC can.
I didn't say the phone was more trusted, I said it was more personal.
Your phone almost certainly knows where you are at all times, for example. It may know whether you're walking or sitting. It knows who calls you, who you call, who you message.
The laptop may know some of that, but it doesn't have the same sensors, and doesn't stay with you most times you leave the house.
I think the thing you neglect when having setups like this is that you start to garner interest from law enforcement if they ever come across you. You're trying so hard to cover your tracks that you stand out very clearly in a crowd.
There's a middle ground between living deep in the woods without windows and walking around naked in public.
It seems you are talking about Social Cooling, https://news.ycombinator.com/item?id=24627363. The more people like me exist, the easier it will be for actual activists and journalists to do their work. Privacy and anonymity are crucial for democracy.
"file paths (obfuscated)" -- this is likely enough for them to work out who the user is, if they work on open source software. They get granular timing data and the files the user has edited, which they could match with open source PRs in their analytics pipeline.
I suspect they aren't actually doing that, but the GDPR cares not what you're doing with the data, but what is possible with it, hence why any identifier (even "obfuscated") which could lead back to a user is considered PII.
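To illustrate why the GDPR treats this as PII: a hypothetical sketch of how unsalted path hashing (one plausible reading of "obfuscated") can be reversed by anyone who can enumerate the plaintext space, e.g. the file list of a public repo. The paths, hash choice, and event shape here are all invented for illustration.

```python
import hashlib

def obfuscate(path: str) -> str:
    # Hypothetical vendor-side "obfuscation": an unsalted hash of the path.
    return hashlib.sha256(path.encode()).hexdigest()

# A telemetry event as the vendor might store it (hash only, no plaintext path).
event = {"file": obfuscate("src/compiler/checker.ts"), "edit_ms": 1234}

# An analyst with access to public repos can enumerate candidate paths
# (the file list of an open-source project) and reverse the "obfuscation".
candidates = ["README.md", "src/compiler/parser.ts", "src/compiler/checker.ts"]
rainbow = {obfuscate(p): p for p in candidates}

print(rainbow.get(event["file"]))  # -> src/compiler/checker.ts
```

Combined with the granular timing data, one matched path against a public PR is enough to deanonymize the user.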
Honestly, I found this whole thread kind of strange. There’s nothing here beyond what most connected IDEs — or even basic office software — already collect by default.
It feels like the goal was more about grabbing attention than raising a real issue. But sure, toss “ByteDance” and “data” into a headline and suddenly it’s breaking news. I'm just tired of this kind of "Big News"- it's boring.
This is true of practically every online community. The vast majority of the users are passive participants, a small fraction contribute, and a small subset of contributors generate most of the content. Reddit is a prime example of this, the numbers are incredibly lopsided there.
This isn't true, this is the sort of toxic "if I have nothing to hide then why value privacy" ideology that got us into this privacy nightmare.
Every single person has "something to hide", and that's normal. It's normal to not want your messages snooped through. It doesn't mean you're a criminal, or even computer-savvy.
Mhhh it is not really about “nothing to hide”, it was more that if you use niche services targeted at privacy, it puts a big target on you.
Like the Casio watches, travelling to Syria, using Tor, Protonmail, etc…
When it is better in reality to have a regular watch, a Gmail with encrypted .zip files or whatever, etc.
It does not mean you are a criminal if you have that Casio watch, but if you have this, plus encrypted emails, plus travel to some countries as a tourist, you are almost certain to put yourself in trouble for nothing, while you tried to protect yourself.
And if you are a criminal, you will put yourself in trouble too, also for nothing, while you tried to protect yourself.
This was the basis of XKeyscore, and all of that is to say that Signal is one very good signal that a person may be interesting.
2. Using a secure, but niche, service is still more secure than using a service with no privacy.
Sure, you can argue using Signal puts a "target" on your back. But there's nothing to target, right? Because it's not being run by Google or Meta. What are they gonna take? There's no data to leak about you.
If I were a criminal, which I'm not, I'd rather rob a bank with an actual gun than with a squirt gun. Even though having an actual gun puts a bigger target on your back. Because the actual gun works - the squirt gun is just kinda... useless.
>If I were a criminal, which I'm not, I'd rather rob a bank with an actual gun than with a squirt gun. Even though having an actual gun puts a bigger target on your back. Because the actual gun works - the squirt gun is just kinda... useless
Actually, there was a case... I can't recall but it might have been in Argentina, where the robbers did explicitly use fake guns when robbing the banks because doing so still actually worked for the purposes of the robbery, and it also reduced their legal liability.
It's a dark pattern called "placebo controls" - giving users the illusion of choice maintains positive sentiment while maximizing data collection, and avoids the PR hit of admitting telemetry is mandatory.
Telemetry toggles add noise to the data at the very least. IMO it's part of the reason you're actually better off with no client-side telemetry at all. Obviously they see it the opposite way.
Your analysis is thorough, and I wonder if their reduction of processes from 33 to 20...(WOW) had anything to do with moving telemetry logic elsewhere (hence increased endpoint activity).
I very much like the fact that I've come back to a TUI (the Helix editor) recently.
I'm trying Zed too, which I believe also comes with telemetry as a commercial product... but yeah, learning the advanced rules of a personal firewall is always helpful!
1. Try using pi-hole to block those particular endpoints via making DNS resolution fail; see if it still works if it can’t access the telemetry endpoints.
2. Their ridiculous tracking, disregard of the user preference to not send telemetry, and behavior on the Discord when you mentioned tracking says everything you need to know about the company. You cannot change them. If you don’t want to be tracked, then stay away from Bytedance.
Hate to break it to you, but /etc/hosts only works for apps that use getaddrinfo or similar APIs. Anything that does its own DNS resolution, which coincidentally includes anything Chromium-based, is free to ignore your hosts file.
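For the curious, "doing its own DNS resolution" just means building the RFC 1035 query packet yourself and sending it to a resolver of your choosing, never going through libc (and thus never consulting the hosts file). A minimal sketch; the hostname and resolver address are placeholders:

```python
import socket, struct

def build_dns_query(hostname: str, qid: int = 0x1234) -> bytes:
    # Standard query header: id, flags (RD=1), 1 question, 0 answer/auth/extra.
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(bytes([len(l)]) + l.encode() for l in hostname.split(".")) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack(">HH", 1, 1)

query = build_dns_query("telemetry.example.com")

# An app that wants to ignore /etc/hosts sends this straight to a resolver
# of its own choosing, never calling getaddrinfo:
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.sendto(query, ("8.8.8.8", 53))
```

A Chromium-style built-in resolver is far more elaborate, but it bypasses the hosts file for exactly this reason.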
But pi-hole seems equally susceptible to the same issue? If you're really serious about blocking you'd need some sort of firewall that can intercept TLS connections and parse SNI headers, which typically requires specialized hardware and/or beefy processor if you want reasonable throughput speeds.
I configured my router to redirect all outbound port 53 udp traffic to adguard home running on a raspberry pi. From the log, it seems to be working reasonably enough, especially for apps that do their own dns resolution like the netflix app on my chromecast. Hopefully they don't switch to dns over https any time soon to circumvent it.
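For anyone wanting to replicate this, the redirect described above can be expressed on an iptables-based router roughly like this. A sketch, not a tested config: the LAN interface name and the 192.168.1.2 resolver address are assumptions, and depending on your topology you may also need an SNAT/masquerade rule so replies route back through the router.

```shell
# Hijack all outbound DNS (udp+tcp port 53) from the LAN and send it to
# the local AdGuard Home / Pi-hole instance at 192.168.1.2 instead.
iptables -t nat -A PREROUTING -i br-lan -p udp --dport 53 \
  ! -d 192.168.1.2 -j DNAT --to-destination 192.168.1.2:53
iptables -t nat -A PREROUTING -i br-lan -p tcp --dport 53 \
  ! -d 192.168.1.2 -j DNAT --to-destination 192.168.1.2:53
```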
DNS over https depends on the ability to resolve the DoH hostname via DNS, which is blockable via PiHole, or depend on a set of static IPs, which can be blocked by your favorite firewall.
A sufficiently spiteful app could host a DoH resolver/proxy on the same server as its api server (eg. api.example.com/dns-query), which would make it impossible for you to override DNS settings for the app without breaking the app itself.
In the context of snooping on the SNI extension, you definitely can.
The SNI extension is sent unencrypted as part of the ClientHello (the first part of the TLS handshake). Any router along the way sees the hostname that the client provides in the SNI data, and can drop the packet if it so chooses.
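Since the hostname really is just length-prefixed plaintext bytes in the extension, reading it takes a few lines. A minimal sketch of the RFC 6066 server_name extension body (full TLS-record and ClientHello framing omitted; the hostname is a placeholder):

```python
import struct

def encode_sni(hostname: str) -> bytes:
    # RFC 6066 server_name extension body: a ServerNameList with one
    # host_name entry (type 0). Sent in the clear inside the ClientHello.
    name = hostname.encode("ascii")
    entry = struct.pack(">BH", 0, len(name)) + name
    return struct.pack(">H", len(entry)) + entry

def parse_sni(ext: bytes) -> str:
    # What a middlebox does: walk the ServerNameList and read the hostname.
    (list_len,) = struct.unpack_from(">H", ext, 0)
    pos = 2
    while pos < 2 + list_len:
        name_type, name_len = struct.unpack_from(">BH", ext, pos)
        pos += 3
        if name_type == 0:  # host_name
            return ext[pos:pos + name_len].decode("ascii")
        pos += name_len
    raise ValueError("no host_name entry")

ext = encode_sni("telemetry.example.com")
print(parse_sni(ext))  # -> telemetry.example.com
```

(Encrypted ClientHello / ECH closes this hole where deployed, but plain SNI is still the common case.)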
When the nefarious actor is already inside the house, who knows to what lengths they will go to circumvent the protections? External network blocker is more straightforward (packets go in, packets go out), so easier to ensure that there is nothing funny happening.
On Apple devices, first-party applications get to circumvent LittleSnitch-like filtering. Presumably harder to hide this kind of activity on Linux, but then you need to have the expertise to be aware of the gaps. Docker still punches through your firewall configuration.
So that these domains are automatically blocked on all devices on a local network. Also, you can't really edit the hosts file on Android or iOS, but I guess mobile OSes are not part of the discussion here.
Although there are caveats -- if an app decides to use its own DNS server, sometimes secure DNS, you are still out of luck. I just recently discovered that Android webview may bypass whatever DNS your Wi-Fi points to.
Yeah, that was my point. I'm not sure what's so breath taking about what ByteDance is doing. I'm not a fan. But, with Meta, Google, Microsoft and I'll throw on Amazon, a huge chunk of the general public's web activity is tracked. Everywhere. All the time. The people have spoken, they are okay with being tracked. I've yet to talk with a non-technical person who was shocked that their online activity was tracked. They know it is. They assume it is. ByteDance's range of telemetry does not matter to them. Just wanna keep on tiktok'ing. Why does telemetry sent to Bytedance matter? Is it a China thing? I'm not concerned about a data profile on me in China. I'm concerned about the ones here in the US. I'll stop. I'm not sure I have a coherent point.
I can also suggest OpenSnitch or Portmaster to anyone who's conscious of these network connections. I couldn't live without them; never trust opt-outs.
Naming is hard, but if there really were two different AI IDEs with nearly identical names, that's no accident.
But it seems like traeide.com is, in the best case, someone's extremely misleading web design demo; worst case, a scam.
On the traeide website:
> Educational demo only. Not affiliated with ByteDance's Trae AI. Is Trae IDE really free? What's the catch? Yes, Trae IDE is completely free with no hidden costs. As a ByteDance product, it is committed to making advanced AI coding tools accessible to all developers.
By the way TRAE isn't free anymore, they now provide a premium subscription.
If the latter really is just a web design demo, it has a bunch of red flags. Why the official-sounding domain? Download links for executables!!! If it is just a web design demo for a portfolio, why is there no contact information for the author whose work and skills it's supposed to advertise?
It's cheap; the AI features cost about half of what other editors are charging ($10/mo), and the free tier has a generous limit. I guess you pay the difference with something else :)
I’m one of those who use it—mainly because it’s cheap, as others have mentioned. I wish Cursor offered a more generous limit so I wouldn’t need another paid subscription. But it won’t. So Trae comes in — fulfilling that need and sealing the deal. This is what we call competition: it brings more freedom and helps everyone get what they want. Kudos to the competition!
I'm not defending Trae’s telemetry — just pointing out the hard truth about why pricing works and why many people care less about privacy concerns (all because there are no better alternatives for them, considering the price.)
By the way, for those who care more about pricing($7.5/M) — here you go: https://www.trae.ai/. It’s still not as good as Cursor overall (just my personal opinion), but it’s quite capable now and is evolving fast — they even changed their logo in a very short time. Maybe someday it could be as competitive as Cursor — or even more so.
I wonder how many of these telemetry events can sneakily exfiltrate arbitrary data like source code. For example, they could encode arbitrary data into span IDs, timestamps (millisecond and nanosecond components), or other per-event UIDs. It may be slow...but surely it's possible.
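As a toy illustration of the idea: a sketch where a 16-hex-character "span id" is half random padding and half payload, letting each telemetry event smuggle out 4 bytes. Everything here (field shapes, chunk size) is invented for the example; it is not a claim about any particular product's payloads.

```python
import os

def exfil_span_ids(secret: bytes, chunk_size: int = 4):
    # Each "span id" looks like a random 16-hex-char tracing id, but its
    # last 8 hex chars actually carry 4 bytes of payload.
    ids = []
    for i in range(0, len(secret), chunk_size):
        chunk = secret[i:i + chunk_size].ljust(chunk_size, b"\x00")
        ids.append(os.urandom(4).hex() + chunk.hex())
    return ids

def recover(ids, length):
    # The collector strips the random prefix and reassembles the payload.
    data = b"".join(bytes.fromhex(sid[8:]) for sid in ids)
    return data[:length]

secret = b"API_KEY=hunter2"
ids = exfil_span_ids(secret)
print(ids)                         # indistinguishable from tracing noise at a glance
print(recover(ids, len(secret)))   # -> b'API_KEY=hunter2'
```

At 4 bytes per event it's slow, but a chatty client emitting thousands of events per session has plenty of bandwidth, and the channel is invisible to anyone who only inspects the declared fields.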
Telemetry isn't the same thing as spying on user data. You should read below if you are not clear about the differences:
What is telemetry data?
Telemetry is anonymous, aggregated usage data that helps developers understand how a product is being used. It usually includes stuff like:
- What features are being used (e.g. “X% of users use the terminal panel daily”)
- How often the app crashes, and where
- Performance stats (e.g. memory usage, load times)
- Environment info (like OS version, screen resolution — not personal files)
It does NOT include your source code, files, passwords, browser content, or any personal identifiers unless explicitly stated. And in most reputable products, it’s either anonymized or pseudonymized by default.
Who uses telemetry? Everyone.
- VS Code collects telemetry to improve editor performance and user experience. For products that are vscode fork, they inherit the vscode telemetry switch by default.
- Chrome uses telemetry to understand browser performance, crashes, and feature adoption
- Slack, Discord, and Postman all rely on telemetry to debug, prioritize features, and improve product quality
Without telemetry, you can’t know which features are working, where users are getting stuck and what’s actually causing bugs.
So when people say “telemetry = privacy breach,” they’re confusing helpful system analytics with data exploitation. The real concern should be around what is collected, how it’s stored, and whether users can opt out — not the mere existence of telemetry data itself.
Telemetry data itself doesn't directly cause high CPU usage; high CPU usage can have many different causes.
Sharing this here just because some of the reasoning and arguments in the original post seem off. Don't want people to get confused.
Well there's a middle ground - Sublime Text isn't free but it's fantastic and isn't sending back all my code/work to the Chinese Government. Sorry, "Telemetry"
And the other side of the middle ground, Grafana being AGPL but requiring you to disable 4 analytics flags, 1 gravatar flag, and (I think) one of their default dashboards was also fetching news from a Grafana URL.
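For reference, the flags being alluded to look roughly like this in grafana.ini. A sketch from memory; names and defaults vary by Grafana version, so check your version's defaults.ini before relying on this:

```ini
[analytics]
reporting_enabled = false
check_for_updates = false
check_for_plugin_updates = false
feedback_links_enabled = false

[security]
disable_gravatar = true

[news]
news_feed_enabled = false
```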
As for why people outside these companies use their products, it usually comes down to two reasons: a) Their employer has purchased licenses and wants employees to use them, either for compliance or to get value from the investment; or b) They genuinely like the product—whether it’s because of its features, price, performance, support, or overall experience.
Hmm. Are you aware that I was responding to this comment?
> Why do people use obvious spyware when free software exists?
So, even though the poster was referring to ByteDance when they said "obvious spyware", I was feigning incomprehension in order to ask the question, how do we differentiate ByteDance from what Microsoft, Apple, Google, Amazon (and the rest) do.
It's a real question - why do technical people, who arguably should know better, and can do something about it - continue to use these data-harvesting and user-selling platforms? The answer is obvious when it's the case of an employee of those companies, I grant you that.
My apologies if you feel your response did address that, and I missed it. If so, please help me see what I missed.
And the Snowden revelations happened, which programmers and sysadmins and etc saw, and then... continued as before, in the large majority of cases. It'd be baffling, if it wasn't so easily explained by the usual mixture of self-interest and moral cowardice.
Spying and telemetry are not something specific to ByteDance. Example: Google? Or Microsoft? Why is it a problem only when it is ByteDance or Huawei, for the exact same activity?
In fact, the Chinese entities are even less likely to share your secrets with your government than their best friends at Google.
No one in the chain of comments you are replying to has mentioned anything about Google, and on HackerNews you will find the majority sentiment is against spying in all forms - especially by Google, Meta, etc.
Even if we interact with your rhetoric[1] at face value, there is a big difference between data going to your own elected government versus that of a foreign adversary.
So you are implying at the end that it is better that your secrets (“telemetry”) go to your local agencies and to possible relatives or family who work on Gmail, Uber, etc ?
I'm sorry but why? Your government can use this data to actually hurt you and put you on the no-fly list, or even put you in prison.
But a foreign government is limited to what it can do to you if you are not a very high-value target.
So I try as much as possible to use software and services from a non-friendly government because this is the highest guarantee that my data will not be used against me in the future.
And since we can all agree that any data that is collected will end up with the government some way or another, using foreign software is the only real guarantee.
Unless the software is open source and its server is self-hosted, it should be considered Spyware.
In my mind, the difference is that spying does or can contain PII, or PII can be inferred from it, where telemetry is incapable of being linked to an individual, to a reasonable extent.
Every single piece of telemetry sent over the internet includes PII - the IP address of the sender - by virtue of how our internet protocols are designed.
Apple provides telemetry services that strip the IP before providing the data to the app owners. Routing like this requires trust (just as a VPN does), but it's feasible.
You said it's different from spying because there is no PII in the information. Now you're saying it's different because it's not given to app owners.
Why is it relevant whether they provide it to app owners directly? The issue people have is the information is logged now and abused later, in whatever form.
Which has clear logical consistency at the app-owner level, which is the context of my reply.
If the app owner can't obtain PII, I don't believe the app owner is spying.
Is Apple spying?
> Routing like this requires trust
It depends on whether you trust them and their privacy policy. If they're functioning as a PII-stripping proxy, as they claim, then I would claim no, to the extent of what's technically possible. I would also claim that a trustworthy VPN is not spying on you. YMMV.
This is like saying every physical business is collecting PII because employees can technically take a photo of a customer. It's hard to do business without the possibility of collecting PII.
No, it's like saying a business that has a CCTV camera recording customers, and sending that data off site to a central location, where they proceed to use the data for some non-PII-related purpose (maybe they're tracking where in stores people walk, on average), is in fact sending PII to that off-site location.
Distinguishing factors from your example include
1. PII is actually encoded and handled by computer systems, not the mere capability for that to occur.
2. PII is actually sent off site, not merely able to be sent off site.
3. It doesn't assert that the PII is collected, which could imply storage, it merely asserts that it is sent as my original post does. We don't know whether or not it is stored after being received and processed.
If you imagine the CCTV camera in my example is a film-video-camera and the processing happening off site is happening in a dark room and not on a computer... my more accurate version of your analogy is also analog.
Those who collect PII, anonymized or not, collect it for one or more legitimate purposes, but that same information lends itself to ends which can reasonably be construed as spying once it is inevitably exposed to those who want to spy. App developers can't plausibly deny knowing that this sharing will occur, or is exceedingly likely to occur, and by making such data collection opt-out, they knowingly act on behalf of spies, despite having no intention of spying directly themselves. If you are an app developer with opt-out telemetry, or an end user of an app so developed, who is the spy and who is doing the spying is a distinction without a difference in my view.
Anonymized or not, opt-out telemetry is plain spying. Go was about to find out, and they backed out at the last millisecond and converted to opt-in, for example.
Telemetry can be implemented well. The software you use gets bugs fixed much faster, since you get statistics showing that some bugs have higher impact than others. And the more users software has, the less skilled they are, on average, at accurately reporting any issues.
The PowerShell team at Microsoft added opt-out telemetry to track when it was launched so they could make the case internally that they should get more funding, and have more internal clout.
It’s easy to argue that if you are a PowerShell user or developer you benefit from no telemetry, but it’s hard to argue that you benefit from the tool you use being sidelined or defunded because corporate thinks nobody uses it. “Talk to your users” doesn’t solve this because there are millions of computers running scripts and no way to know who they are or contact them even if you could contact that many people, and they would not remember how often they launched it.
> it’s hard to argue that you benefit from the tool you use being sidelined or defunded because corporate thinks nobody uses it.
Let the corporation suffer then. With an open API, a third party will make a better one. Microsoft can buy that; corporations have a habit of doing that.
> “Talk to your users” doesn’t solve this because there are millions of computers running scripts
Why are you worried about the problems that scripts face? If the developer encounters issues in scripts, the developer can work to fix it. Sometimes that might mean filing a bug report... or a feature request for better documentation. Or the developer might get frustrated and use something better. Like bash.
> there are millions of computers running scripts and no way to know who they are or contact them
Why do they matter to you, or a corporation then?
> they would not remember how often they launched it.
If your users aren't interacting with you for feature requests and bug reports, then either you don't have users or you don't have good enough reachability from the users to you.
> "Why are you worried about the problems that scripts face? Why do they matter to you?"
because I write and run such scripts.
> "Let the corporation suffer then"
Microsoft wouldn't suffer, PowerShell users would suffer.
> "sometimes that might mean filing a bug report... or a feature request for better documentation. "
In this scenario the PowerShell team has been defunded or sacked. Who will the bug report go to? Who will implement the feature request?
> "If your users aren't interacting with you for feature requests and bug reports, then either you don't have users or you don't have good enough reachability from the users to you."
Users are interacting with Microsoft for feature requests and bug reports. There are a thousand open issues on https://github.com/powershell/powershell/ and many more which were closed "due to inactivity". What difference does that make if Corporate doesn't want to fund a bigger team to fix more bugs unless it can be shown to benefit a lot of customers not just "a few" devs who raise issues?
'kay. Learn how to do Engineering and the software will come just fine. You don't need telemetry to tell you anything about scripts. You need good error reports for your users to send to you instead.
> Microsoft wouldn't suffer, PowerShell users would suffer.
So what you're saying is that Microsoft doesn't care about its users. PowerShell users should use products from better companies then.
> In this scenario the PowerShell team has been defunded or sacked. Who will the bug report go to? Who will implement the feature request?
Why were they sacked?
Oh, right, because they didn't interact with their users.
Who will the bug report go to? Clearly it's the same as before: nobody. That's a Microsoft problem.
> What difference does that make if Corporate doesn't want to fund a bigger team to fix more bugs unless it can be shown to benefit a lot of customers not just "a few" devs who raise issues?
If Corporate doesn't want to fund bugfixes and features for people who actually file bug reports and talk to you, then that's poor behavior of corporate. Why do you want to contribute to the decline of your users privacy?
To take that logic to its extreme: I'm sure we could have amazing medical breakthroughs if we just gave up that pesky 'don't experiment on non-consenting humans' hang-up we have.
The parent said "talk to your users instead of telemetry" and I said "there are scenarios where telemetry can get information that you cannot get by talking to users". How did you go from that to "experimenting on non-consenting humans"?
To take your logic to its extreme, you have a disease and are prescribed pills, and the pharmaceutical company says "we will track when you take the pills - unless you don't want us to?" and you would prefer the researchers get shut down for not knowing whether anyone actually takes the pills, and an unlimited number of people die from treatable diseases that don't get cured.
Medical research and consent don't work like this. If you track your patients without their consent, or share their data without their explicit consent, you'll land in very hot water, which will cook you before you can even scream.
Similarly, a medical trial requires very detailed consent before you can start.
Your opt-out telemetry is akin to your insurance company sending you powered, Bluetooth-enabled toothbrushes out of the blue to track you, and threatening to cancel your insurance if you don't use that toothbrush and send data to them.
Or, as a more extreme example, going through an important procedure not with the known and proven method but with an experimental one, because you didn't opt out and nobody bothered to tell you. In reality, you need to sign consent and waiver forms to accept experimental methods.
> "Medical research and consent doesn't work like this."
Yes, I agree that person's comparison to non-consensual medical research is stupid.
> "Your opt-out telemetry is akin to your insurance sending you powered and Bluetooth enabled toothbrushes out of the blue to track you and threaten to cancel your insurance if you don't use that toothbrush and send data to them."
More akin to your insurance company making a public RFC where you can discuss the coming telemetry, then you choosing to ask your insurance for an optional toothbrush, being able to opt out of telemetry if you want to, the insurance company documenting how to opt out[1], you being able to edit the toothbrush source code to remove the telemetry entirely with the insurance company's approval because it's MIT licensed, and absolutely nothing happening to you if you opt out.
> I don't understand how you got from "there are scenarios where telemetry can get information that you cannot get by talking to users, here is one example" to "experimenting on non-consenting humans". What is the connection?
The connection is clear if your salary doesn't require you to not understand it.
Developers don't opt-in to telemetry? Maybe it's because they don't want to enable that telemetry, your experiments be damned.
Use proper engineering to demonstrate that your scripts work instead of demanding that users be your free software test team.
You said 'but we wouldn't have a lot of improvement without telemetry'. I am saying that we could have a lot of improvement in a lot of things if we wanted. We could have breakthroughs in medicine if we allowed human experimentation. The question is, where is that line? Your argument doesn't address that; it just tries to justify something that people think is morally wrong by stating that we get use from it.
> "You said 'but we wouldn't have a lot of improvement without telemetry'."
I did not say that. Within the context of Microsoft's internal funding, maybe, but we could have the same improvement by Microsoft throwing more money at the PowerShell team without this telemetry. The core thing I said was that the information the telemetry gathers cannot be obtained by "talk to your users", not that the telemetry leads to amazing improvements.
It is still difficult for you to make the case that someone choosing to download PowerShell can be "not consenting" (and before you reply saying "PowerShell ships with Windows", the PowerShell which has telemetry does not [yet] ship with Windows).
That's only after you read this article. The question is how you know it's spyware even before you install it. At least it's not clear to me from the GitHub README that I would be knowingly installing spyware.
I'm with you, but I don't see the problem with their argument. They should have mentioned GDB, Valgrind, and maybe things like pdb and ruff, but I think their point was clear enough without it. Hell, in vim I use ruff for linting and you can jump into a debugger. When you have it configured that way people do refer to it as an IDE. It isn't technically correct but it gets the point across to the people who wouldn't know that
What is there in an IDE today, that is missing from (n)vim? With the advent of DAP and LSP servers, I can't find anything that I would use a "proper" IDE for.
- popup context windows for docs (kind of there, but having to respect the default character grid makes them much less capable and usually they don't allow further interaction)
- contextual buttons on a line of code (sure, custom commands exist, but they're not discoverable)
Don't IDEs use DAP as well? That would mean neovim has 1:1 feature parity with IDEs when it comes to debugging. I understand the UI/UX might need some customization, but it's not like the defaults in whatever IDE fit everyone either.
Popup context windows for docs are super good in neovim; I would bet they are actually better than what you find in IDEs, because they can use treesitter for automatic syntax highlighting of example code. Not sure what you mean by further interaction.
Contextual buttons are named code actions, and are available, and there are like 4 minimap plugins to choose from.
Sorry I don't know enough about VS to answer this, but if the debugger in question is using DAP then there is a more than fair chance it's available in neovim as well.
These are called "balloon"s[1]. Plenty of people have setups for things like docs (press "K") or other things (By default "K" assumes a man page)
> contextual buttons on a line of code
I don't know what this means, can you explain?
> minimap
Do you mean something like this?[2] Personally, I use tagbar[3] as I like using ctags and being able to jump around in the project.
The "minimap" is the only one here that isn't native. You can also have the file tree on the left if you want. Most people tend to use NerdTree[4], but as with a lot of plugins, there are builtins that are just as good. Here's the help page for netrw[5], vim's native file explorer.
Btw, this all works in vim. No need for neovim for any of this stuff. Except for the debugger, this stuff has been here for quite some time, and the debugger has been around as a plugin for a while too. All this stuff has been here since I started using vim, which was over a decade ago (maybe balloons didn't have as good of an interface? Idk, it's been a while)
And are not interactive as far as I know. I've not seen a way to get a balloon on the type in another balloon and then go to the browser docs from a link in that.
> Do you mean something like this?
Yes, but that's still restricted to terminal characters (you could probably do something fancy with sixel, but still) - for larger files with big indents it's not useful anymore.
> contextual buttons on a line of code
For example options to refactor based on the current location. I could construct this manually from 3 different pieces, but this exists in other IDEs already integrated and configured by default. Basically where's the "extract this as named constant", "rename this type across the project" and others that I don't have to implement from scratch.
I mean you use completion, right? That's interaction? In insert mode <C-p> or <C-n>, same to scroll through options.
> [tabbar is] still restricted to terminal characters (you could probably do something fancy with sixel,
Wait... you want it as an image? I mean... sure? You could, but I'm really curious why you would want that. I told you this was one option, but there are others. Are you referring to the one that was more visual and didn't show actual text? Idk, I'm not going to hunt down that plugin for you and I'm willing to bet you that it exists.
> For example options to refactor based on the current location.
First off, when quoting it helps to add more >'s to clarify the depth. So ">>>" in this case. I was confused at first as I didn't say those words (Also, try adding two leading spaces ;)
Second, sure, I refactor all the time. There's 3 methods I know. The best way is probably with bufdo and having all the files opened in a buffer (tabs, windows, or panes are not required). But I'm not sure why this is surprising. Maybe you don't know what ctags are? If not, they are what makes all that possible and I'd check them out because I think it will answer a lot of your questions.
> Basically where's the "extract this as named constant", "rename this type across the project"
Correct me if I'm wrong, but you are asking about "search and replace" right? I really do recommend reading about ctags, and I think these two docs will give you answers to a lot more things than just this question[0,1]. Hell, there's even The Primeagen's refactoring plugin in case you wanted to do it another way that's not vim-native.
But honestly, I really can't tell if you're just curious or trying to defend your earlier position. I mean if you're curious and want to learn more we can totally continue and I'm sure others would love to add more. And in that case I would avoid language like "vim doesn't" and instead phrase it as "can vim ___?", "how would I do ____ in vim?", or "I find ___ useful in VS code, how do people do this in vim?" Any of those will have the same result but not be aggressive. But if you're just trying to defend your position, well... Sun Tzu said you should know your enemy and I don't think you know your enemy.
Very basic one. What I mean is once you get the completion, how do you interact with that view - let's say you want to dig into a type that's displayed. Then you want to get to the longer docs for that type. There's nothing out there that does it as far as I know.
> Wait... you want it as an image?
Yes, the asciiart minimaps are cool, but they really don't have enough resolution for more complex longer files in my experience.
> The best way is probably with bufdo and having all the files opened in a buffer
You see why this is not great, right? That's an extra thing to think about.
> Maybe you don't know what ctags are?
I know. It's step 1 out of many for implementing proper refactoring system.
> but you are asking about "search and replace" right?
Search and replace with language and context awareness. You can diy it in vim or start stacking plugins. Then you can do the same with the next feature (like inserting method stub). But... I can just use an actual IDE with vim mode instead.
> And in that case I would avoid language like "vim doesn't"
Vim doesn't do those things though. There's a whole ecosystem of additions of plugins of the day that add one thing or another. But it turns out it's easier to embed nvim in an ide than play with vim plugins until you get something close to ide. Been there for years, done that, got tired. VS with vim mode has better ide features than vim with all the customised plugins.
I guess because I don't use VSC I don't know what you're talking about (can you show me?) but getting docs is not an issue to me. If I want the doc on a function I press K in normal mode.
> That's an extra thing to think about.
Is it? I mean the difference is literally
%s/foo/bar/g
bufdo %s/foo/bar/g
I don't see how that's more than what you'd do in any other system. You want to be able to replace one instance, all instances in the file, and all instances everywhere, right? Those can't all be the exact same command.
And it's not very hard to remember things like bufdo, windo, tabdo because I'm already familiar with buffers, tabs, and windows. It's not an extra item in memory for me, so no, I don't see it. It's just as easy and clear as if I clicked a button that said "do this in all files"
> Search and replace with language and context awareness
You mean ins-completion? That's native. I can complete things from other files (buffers), ctags, and whatever. You can enable auto suggest if you really want but that's invasive for me and distracting. But to each their own. I mean the right setup is only the right setup for you, right?
> Vim doesn't do those things though.
Yet I'm really not sure what's missing. I'll give you the minimap but I personally don't really care about that one. Is it that big of a deal? (I already know what percentage of the file I'm in and personally I'd rather the real estate be used for other things. But that's me). But so far this conversation has been you telling me vim doesn't do something, me showing you it does, and you just saying no. To me it just sounds like you don't know vim. It's cool, most people don't read docs ¯\_(ツ)_/¯
I mean there's a lot of stuff that people who have been using vim for years don't know but is in vimtutor. I mean how many people don't know about basic things like ci, completion (including line or file path), or <C-[>? How many people use :wq lol
I just like vim man. You don't have to, that's okay. I like that I can do marks. I love the powerful substitution system. I mean I can just write the signatures of my init functions and automatically create the class variables. Or deal with weird situations like this time some Python code had its documentation above the function and I could just bufdo a string replace to turn those into proper docstrings. I love that I can write macros on the fly, trivially, and can apply them generously. I love registers and how powerful they are. I mean I can write the command I want on a line, push it into a register, and then just call it with @. It's trivial to add to my rc if I like it enough. I love that it's really easy to drop in a local config file that sets the standards for the project I'm working on when it differs from my defaults and I can even share that with everyone! I really like the fact that I can have my editor on just about every nix machine and I don't even need to install it. I can work effectively on a novel machine disconnected from the internet.
I mean my love for vim isn't really just the navigation. But even in the navigation I'm constantly using bindings that most vim plugins don't have. It's not easier for me to use another system and add vim keybindings because that's only a very small portion of vim. I'd rather have all of vim and a fuck ton more of my resources.
I don't think you understand what I mean with the language aware rename. It's not even close to %s. Let's say I've got a c# app and I rename a class in VS. This will rename the class in the file, all class usages (but not as text - if I rename A.B, then it will not touch X.B due to different namespaces), including other projects in the solution, optionally will rename the file it lives in and optionally will/won't replace the text in comments. All listed for review and approval and I don't have to have any of those files open ahead of time.
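To make the distinction concrete, here's a toy Python sketch (the names A.B and X.B echo the example above) of why plain text substitution isn't a rename. Even the regex version is not truly language-aware, since it would also match inside comments and strings, which is why real renames work on the parse tree via the compiler or an LSP server:

```python
import re

# Two qualified names that share the member name "B".
src = "a = A.B; x = X.B"

# Naive textual replace: touches X.B too, which is a different namespace.
naive = src.replace("B", "C")

# Slightly smarter: anchor on the fully qualified name with word boundaries.
# Still not language-aware; it cannot tell code apart from comments/strings.
scoped = re.sub(r"\bA\.B\b", "A.C", src)

print(naive)   # a = A.C; x = X.C  (oops, renamed the wrong symbol too)
print(scoped)  # a = A.C; x = X.B  (only the intended symbol)
```

A real rename engine resolves each occurrence to a symbol in the type system first, then rewrites only the occurrences bound to that symbol, which is what the review list in VS is showing you.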
This is something that the LSP provides (even in VScode), and is available in nvim yes. The command is vim.lsp.buf.rename(), and it is bound to "grn" by default.
All the other similar fanciness like renaming a file and automatically updating module references is also provided by the LSP, and is also available in nvim.
gcc/as/ld are batch processors from the GNU toolchain that offer few (if any) features beyond basic support for C/C++ (and a handful of other languages), and they're non-standard toolchains on 2 out of 3 major operating systems, requiring a bit of heavy lifting to use.
It's kind of nonsense to bring them up in this conversation.
I install vscode from scratch, install a few extensions I need, set 3 or 4 settings I use regularly, and bang in 5 minutes I have a customized, working environment catered for almost any language.
vi? Good luck with that.
And I say that as an experienced vim user who used to tinker a bit.
> I install vscode from scratch, install a few extensions I need, set 3 or 4 settings I use regularly, and bang in 5 minutes I have a customized, working environment catered for almost any language.
Weird, I'd say that's my experience with vim. I just carry around my dotfiles, which are not that extensive.
Hell, I will even feel comfortable in a vi terminal, though that's extremely rare to actually find. Usually vi is just remapped to vim
Edit:
The git folder with *all* my dotfiles (which includes all my notes) is just 3M, so I can take it anywhere. If I install all the vim plugins I currently have (some of which are old and I don't use), the total is ~100M. So...
You misread. I'm using 74K for *vim* configs. (Mostly because I have a few files for organization's sake)
I rounded up to 3M from 2.3M, and 1.4M of that is .git lol. 156K is all my rc files, another 124K for anything that goes into ~/.config, 212K for my notes, 128K for install scripts, 108K for templates, and 108K for scripts
I'll repeat myself, with the *same emphasis* as above. Hopefully it's clearer this time.
>> The git folder with *all* my dotfiles (which includes all my notes) is just 3M
I was just saying it's pretty simple to carry *everything* around, implying that this is nothing in comparison to something like a plugin or even VScode itself. I mean I went to the VScode plugin page and a lot of these plugins are huge. *All* of the plugins I have *combined* (including unused) are 78M. The top two most installed VSC plugins are over 50M. Hell, the ssh plugin is 28M! I don't have a single plugin that big!
Props to OP for the screenshots and payloads: that's how you do it. If any IDE wants trust, they know the recipe: make telemetry opt-in by default and provide a real kill switch.
Any proof of them not letting you speak because you mentioned “tracking”? This is weird.
I'm in Trae's Discord (recently decided to try their Pro) and don't see any such moderation. There are multiple users discussing privacy mode or the ToS, and I didn't see any pushback.
How is memory usage related to telemetry? I don't see much direct correlation here. If they cut their memory usage down to the same level as Cursor, does that mean they aren't sending that much data?
pretty much every serious dev tool collects some form of usage data. I don’t think this is some evil conspiracy. it’s how teams figure out what’s working, what’s broken, and what needs improving.
you updated the GitHub repo multiple times, and from the looks of it, that so-called “official censorship” was actually just an automod mute.
tying that to telemetry feels like you’re trying to stitch a story together just to sell a narrative
It's honestly kind of funny: you got timed out by Discord's automod and now you're running around calling it "suppression of technical discussion." lol. Don't be dramatic. There was a wave of crypto spam before, so the automod timeout often catches lots of bots and regular people. I remember even community mods have been timed out before.
That’s fake news. If you check the GitHub issue and actually visit the Trae community, you’ll see the truth.
The community mod already told him the real reason — it was triggered by the keyword “token”, which has been flagged due to repeated crypto bot spam in the past. But instead, he deliberately claimed it was because of the word “track,” and framed a basic anti-spam automod as “censorship” and “punishment.”
It honestly feels like an attention grab. That kind of intentional misrepresentation is pretty dishonest. And if you check the message history, he was actively chatting in the group before — no one was silencing him.
Honestly great to see this. This is the power that FB/Microsoft/Google have if they ever decided to take the gloves off. Maybe this will be the motivating factor to get some privacy laws with fangs.
If you continue to send telemetry after I explicitly opt out, then I get to sue (or at least get a cut of the fines for whistleblowing).
I'm eyeing Zed. Unfortunately I am dependent on a VS Code extension for a web framework I use. VS Code might have reached a critical level of network effect with its extensions, which could make it extremely sticky.
Sad to hear that. I really enjoyed VS Codium before I jumped full-time into Nova.
(Unsolicited plug: If you're looking for a Mac-native IDE, and your needs aren't too out-of-the-ordinary, Nova is worth a try. If nothing else, it's almost as fast as a TUI, and the price is fair.)
> Why isn't there a decently done code editor with VSCode level features but none of the spyware garbage?
Because no other company was willing to spend enough money to reach critical mass other than Microsoft. VSCode became the dominant share of practically every language that it supported within 12-18 months of introduction.
This then allowed things like the Language Server Protocol which only exists because Microsoft reached critical mass and could cram it down everybody's throat.
Because telemetry is how you effectively make a decently done editor. If you don't have telemetry, you will likely end up with lower quality, copying from other editors that are able to effectively build what users want.
Hi HN,
I was evaluating IDEs for a personal project and decided to test Trae, ByteDance's fork of VSCode. I immediately noticed some significant performance and privacy issues that I felt were worth sharing. I've written up a full analysis with screenshots, network logs, and data payloads in the linked post.
Here are the key findings:
1. Extreme Resource Consumption:
Out of the box, Trae used 6.3x more RAM (~5.7 GB) and spawned 3.7x more processes (33 total) than a standard VSCode setup with the same project open. The team has since made improvements, but it's still significantly heavier.
2. Telemetry Opt-Out Doesn't Work (It Makes It Worse):
I found Trae was constantly sending data to ByteDance servers (byteoversea.com). I went into the settings and disabled all telemetry. To my surprise, this didn't stop the traffic. In fact, it increased the frequency of batch data collection. The telemetry "off" switch appears to be purely cosmetic.
3. What's Being Sent:
Even with telemetry "disabled," Trae sends detailed payloads including:
Hardware specs (CPU, memory, etc.)
Persistent user, device, and machine IDs
OS version, app language, user name
Granular usage data like time-on-ide, window focus state, and active file types.
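For illustration, a payload along these lines can be checked for persistent identifiers in a few lines of Python. The key names below are my paraphrase of the categories above, not Trae's actual schema:

```python
import json

# Hypothetical payload mimicking the categories listed above.
# Key names are illustrative guesses, NOT the real wire format.
payload = json.loads("""
{
  "device_id": "d3adb33f",
  "machine_id": "c0ffee",
  "user_name": "alice",
  "os_version": "Windows 11",
  "app_language": "en-US",
  "time_on_ide_ms": 5400000,
  "focused": true
}
""")

# Flag keys that look like persistent identifiers or direct PII.
flagged = sorted(k for k in payload if k.endswith("_id") or k == "user_name")
print(flagged)  # ['device_id', 'machine_id', 'user_name']
```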
4. Community Censorship:
When I tried to discuss these findings on their official Discord, my posts were deleted and my account was muted for 7 days. It seems words like "track" trigger an automated gag rule, which prevents any real discussion about privacy.
I believe developers should be aware of this behavior. The combination of resource drain, non-functional privacy settings, and censorship of technical feedback is a major red flag. The full, detailed analysis with all the evidence (process lists, Fiddler captures, JSON payloads, and screenshots of the Discord moderation) is available at the link. Happy to answer any questions.
VSCode is extremely unsafe and you should only use it in a managed, corporate environment where breaches aren't your problem. This goes with any fork, as well.
If you signed a Nondisclosure agreement with your employer, and you use—without approval—a tool that sends telemetry, you may be liable for a breach of the NDA.
Opening IDEA after those three days was the same kind of feeling I imagine you’d get when you take off a too tight pair of shoes you’ve been trying to run a marathon in.
ymmv, of course, but for $dayjob I can’t even be arsed trying anything else at this point, it’s so ingrained I doubt it’ll be worth the effort switching.
I see a lot of confused comments blaming Microsoft, so to clarify: This analysis is about TRAE, a ByteDance IDE that was forked from VSCode: https://www.trae.ai/
I can't prove it, but I think that's untrue. Anecdotally, I've only heard MS using it in the last 10 years or so, and it's been pretty common terminology for years before that.
Last 10 years is right - Windows 10 was when they went all-in, and that was released in 2015. Before that, "telemetry" usually referred to situations where the same entity owned both ends of the data collection, so "consent" wasn't even necessary.
Microsoft caught flak for backporting telemetry to Windows 7 in the Windows 8/8.1 era. They really started sucking down data in Windows 10, but their spying started years before that.
Yeah. One of the most frustrating things about modern gaming is companies collecting metrics about how their game is played, then publishing "X players did Y!" pages. They're always interesting, but.... why can't I see those stats for my own games?! Looking at you, Doom Eternal and BG3.
You can capture the telemetry data with a HTTPS MITM and read it yourself.
Or (if you're working lower level) you can see an obfuscated function is emitting telemetry, saying "User did X", then you can understand that the function is doing X.
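If you want to try that yourself, here's a minimal sketch for tallying captured requests by destination host. It assumes you've exported the capture as standard HAR 1.2 JSON (most proxies can do this); the sample entries and paths are made up, and only the byteoversea.com domain comes from the article:

```python
import json
from collections import Counter
from urllib.parse import urlparse

def hosts_from_har(har_text: str) -> Counter:
    """Tally captured requests by destination host, given HAR 1.2 JSON."""
    har = json.loads(har_text)
    return Counter(
        urlparse(entry["request"]["url"]).hostname
        for entry in har["log"]["entries"]
    )

# Tiny hand-made capture standing in for a real export; paths are invented.
sample = json.dumps({"log": {"entries": [
    {"request": {"url": "https://byteoversea.com/v1/batch"}},
    {"request": {"url": "https://byteoversea.com/v1/batch"}},
    {"request": {"url": "https://example.com/"}},
]}})

print(hosts_from_har(sample).most_common())
# [('byteoversea.com', 2), ('example.com', 1)]
```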
> You can capture the telemetry data with a HTTPS MITM and read it yourself.
That's not helping me, the user.
That's helping me, the developer.
> Or (if you're working lower level) you can see an obfuscated function is emitting telemetry, saying "User did X", then you can understand that the function is doing X.
Is it just me or does the formatting of this feel like ChatGPT (numbered lists, "Key Takeaways", and the general phrasing)? It's not necessarily an issue if you checked it over properly, but if you did use it, it might be good to mention that for transparency, because people can tell anyway and it might feel slightly disingenuous otherwise.
Don't pay any attention to people giving you shit for using translation software. A lot of us sometimes forget that the whole world knows a little English, and most of us native speakers have a ridiculous luxury in getting away with being too lazy to learn a few other languages.
I think it's good form to mention it as a little disclaimer, just so people don't take it the wrong way. Just write: (this post was originally written by me but formatted and corrected with an LLM, since English is not my primary language).
From what I've seen, people generally do not like reading generated content, but every time I've seen the author come back and say "I used it because English isn't my main language," the community always takes back the criticism. So I'd just be upfront about it and get ahead of it.
Part of the problem with using LLMs for translation is precisely that they alter the tone and structure of what you give them, rewriting things in LLM clichés and style, so it's unsurprising that people see that and just assume completely generated slop. It's unfortunate, and I would probably try to use LLMs if English weren't my first language, but I don't think it's as simple as "using translation software". I've not seen people called out that way for dodgy Google Translate output, for example; it's a problem specific to LLMs and the fundamental issues with their output.
I wasn't annoyed about it, I just said it might be good to mention because people will notice anyway, and at this point there's enough AI slop around that it can make people automatically ignore it so it would be good to explain that. I'm surprised I got downvotes and pushback for this, I thought it was a common view that it's good to disclose this kind of thing and I thought I was polite about it
To be clear I think this has good information and I upvoted it, it’s just that as someone else said it’s good to get ahead of anyone who won’t like it by explaining why and also it can feel a little disingenuous otherwise (I don’t like getting other people to phrase things for me either for this reason but maybe that’s just me)
God forbid people actually learn the language they're trying to communicate in. I'd much rather read someone's earnest but broken English than LLM slop anyway.
It's disingenuous to call LLMs "translation software", and it's bad advice to say "don't pay attention to those people".
Even if you don't agree with it, publishing AI-generated content will exclude from one's audience the people who won't read AI-generated content. It is a tradeoff one has to decide whether or not to make.
I'm sympathetic to someone who has to decide whether to publish in 'broken english' or to run it through the latest in grammar software. For my time, I far prefer the former (and have been consuming "broken english" for a long while, it's one of the beautiful things about the internet!)
I'd rather you write in broken English than filter it through an LLM. At least that way I know I'm reading the thoughts of a real human rather than something that may have its meaning slightly perturbed.
> might be good to mention that for transparency, because people can tell anyway and it might feel slightly otherwise
Devil's advocate: why does it matter (apart from "it feels wrong")? As long as the conclusions are sound, why is it relevant whether AI helped with the writing of the report?
It is relevant because it wastes time and adds nothing of substance. An AI can only output as much information as was put into it. Using it to write a text just makes it unnecessarily more verbose.
The last few sections could have been cut entirely and nothing would have been lost.
Edit: In the process of writing this comment, the author removed 2 sections (and added an LLM acknowledgement), which I referred to in my previous statement. To the author, thank you for reducing the verbosity.
AI-generated content is rarely published with the intention of being informative. * Something being apparently AI-generated is a strong heuristic that something isn't worth reading.
We've been reading highly-informative articles with "bad English" for decades. It's okay and good to write in English without perfect mastery of the language. I'd rather read the source, rather than the output of a txt2txt model.
* edit -- I want to clarify, I don't mean to imply that the author has ill will or intent to misinform. Rather, I intend to describe the pitfalls of using an LLM to adapt one's text, inadvertently adding a very strong flavor of spam to something that is not spam.
True, but there are many more people that speak no English, or so badly that an article would be hard to understand.
I face this problem now with the classes I teach. It's an electronics lab for physics majors. They have to write reports about the experiments they are doing. For a large fraction, this task is extraordinarily hard, not because of the physics, but because of writing in English. So for those students, LLMs can be a gift from heaven. On the other hand, how do I make sure that the text is not fully LLM-generated? If anyone has ideas, I'm all ears.
I don't have any ideas to help you there. I was a TA in a university, but that was before ChatGPT, and it was an expectation to provide answers in English. For non-native English speakers, one of the big reasons to attend an English-speaking university was to get the experience in speaking and reading English.
But I also think it's a different thing entirely. It's different being the sole reader of text produced by your students (with responsibility to read the text) compared to being someone using the internet choosing what to read.
Because AI use is often a strong indicator of a lack of soundness. Especially if it's used to the point where its structural quirks (like a love for lists) shine through.
Because AI isn't so hot on the "I" yet, and if you ask it to generate this kind of document it might just make stuff up. And there is too much content on the internet to delve deep on whatever you come across to understand the soundness of it. Obviously you need to do it at some point with some things, but few people do it all the time with everything.
Pretty much everyone has heuristics for content that feels like low quality garbage, and currently seeing the hallmarks of AI seems like a mostly reasonable one. Other heuristics are content filled with marketing speak, tons of typos, whatever.
I can't decide to read something because the conclusions are sound. I have to read the entire thing to find out if the conclusions are sound. What's more, if it's an LLM, it's going to try its gradient-following best to make unsound reasoning seem sound. I have to be an expert to tell that it is a moron.
I can't put that kind of work into every piece of worthless slop on the internet. If an LLM says something interesting, I'm sure a human will tell me about it.
The reason people are smelling LLMs everywhere is because LLMs are low-signal, high-effort. The disappointment one feels when a model starts going off the rails is conditioning people to detect and be repulsed by even the slightest whiff of a robotic word choice.
edit: I feel like we discovered the direction in which AGI lies but we don't have the math to make it converge, so every AI we make goes completely insane after being asked three to five questions. So we've created architectures where models keep copious notes about what they're doing, and we carefully watch them to see if they've gone insane yet. When they inevitably do, we quickly kill them, create a new one from scratch, and feed it the notes the old one left. AI slop reads like a dozen cycles of that. A group effort, created by a series of new hires, silently killed after a single interaction with the work.
> As long as the conclusions are sound, why is it relevant whether AI helped with the writing of the report?
TL;DR: Because of the bullshit asymmetry principle. Maybe the conclusions below are sound, have a read and try to wade through ;-)
Let us address the underlying assumptions and implications in the argument that the provenance of a report, specifically whether it was written with the assistance of AI, should not matter as long as the conclusions are sound.
This position, while intuitively appealing in its focus on the end result, overlooks several important dimensions of communication, trust, and epistemic responsibility. The process by which information is generated is not merely a trivial detail, it is a critical component of how that information is evaluated, contextualized, and ultimately trusted by its audience. The notion that it feels wrong is not simply a matter of subjective discomfort, but often reflects deeper concerns about transparency, accountability, and the potential for subtle biases or errors introduced by automated systems.
In academic, journalistic, and technical contexts, the methodology is often as important as the findings themselves. If a report is generated or heavily assisted by AI, it may inherit certain limitations, such as a lack of domain-specific nuance, the potential for hallucinated facts, or the unintentional propagation of biases present in the training data. Disclosing the use of AI is not about stigmatizing the tool, but about providing the audience with the necessary context to critically assess the reliability and limitations of the information presented. This is especially pertinent in environments where accuracy and trust are paramount, and where the audience may need to know whether to apply additional scrutiny or verification.
Transparency about the use of AI is a matter of intellectual honesty and respect for the audience. When readers are aware of the tools and processes behind a piece of writing, they are better equipped to interpret its strengths and weaknesses. Concealing or omitting this information, even unintentionally, can erode trust if it is later discovered, leading to skepticism not just about the specific report, but about the integrity of the author or institution as a whole.
This is not a hypothetical concern, there are numerous documented cases (eg in legal filings https://www.damiencharlotin.com/hallucinations/) where lack of disclosure about AI involvement has led to public backlash or diminished credibility. Thus, the call for transparency is not a pedantic demand, but a practical safeguard for maintaining trust in an era where the boundaries between human and machine-generated content are increasingly blurred.
yeah man, next time VSCode crashes and recovers your unsaved work, just remember: it knows way too much about you.
this kind of overreaction is exactly why real privacy concerns get ignored. it misses the point and just ends up misleading devs who actually care about meaningful issues.
In the OP screen share, they toggle various telemetry options on and off, but every time a setting changes, there is a pop-up that says "a setting has changed that requires a restart [of the editor] to take effect" -- and the user just hits "cancel" and doesn't restart the editor. Then, unsurprisingly, the observed behavior doesn't change. Maybe I'm dumb and/or restarting the editor doesn't actually make a difference, but at least superficially, I'm not sure you can draw useful conclusions from this kind of testing...
edit: to be clear I see that they X-out the topmost window of the editor and then re-launch from the bottom bar, but it's not obvious that this is actually restarting the stuff that matters
I tested it both ways and the telemetry stays the same. The prompt is to restart the IDE, but I wanted to disable both telemetry options before doing so.
Thanks for watching and catching that. It seems like a major oversight for the core claim: That disabling telemetry doesn’t work. If a restart is required and the tests ignored the restart warning that would invalidate the tests.
Either way, it’s useful to see the telemetry payloads.
See the author's response. They say it doesn't matter either way:
https://news.ycombinator.com/item?id=44706580
There's also the Eclipse VS Code-look-alike reimplementation called Theia IDE
https://theia-ide.org/
It was rough a few years ago, but nowadays it's pretty nice. TI rebuilt their Code Composer Studio using Theia so it does have some larger users. It has LSP support and the same Monaco editor backend - which is all I need.
It's VS Code with an Eclipse feel to it - which might or might not be your cup of tea, but it's an alternative.
> Try Theia IDE online
click
> Please login to use this demo
close tab
Agreed, not the most well-thought-out landing page, but the explore page gives good insight into how it's being used and what it looks like: https://theia-ide.org/theia-platform/
(Scroll down to Selected Tools based on Eclipse Theia)
The feature that keeps me from moving off of vscode is their markdown support. In particular the ability to drag and drop to insert links to files and images *. Surprisingly, no other editor does this even though I use it all the time.
* https://code.visualstudio.com/Docs/languages/markdown#_inser...
It's also a good alternative to Obsidian if you don't need smartphone support.
https://dendron.so/ is more or less Obsidian in VSCode, and free and open source.
But Dendron is a zombie project.
I don’t mind a project being done and in maintenance mode. But I am not investing my time into starting to use it.
The Getting Started page has broken screenshots (hosted on AWS).
I took the plunge and don't regret it. Despite the condition of the web site the extension is very useful and relatively free of any annoying bugs.
At some point, I'd like to invest the time in building a custom version of the extension, bumping its dependencies to get more modern support for PlantUML/Mermaid diagrams.
Obsidian supports this. (Or at least, it supports pasting an image from the clipboard, so I'm assuming drag and drop works too.)
Interesting.
I belong to the class of people who believe in customising their tools as they please. So I'd have written an Emacs package to do this. But then again, this is Emacs, so someone's probably already done it. Oh, here it is: https://github.com/mooreryan/markdown-dnd-images
Thank you! The timing of this comment is perfect
But if I'm not wrong here, this is still just VS Code / Electron?
It is Electron and Monaco (the text editor itself), but there is a lot more to VS Code / Theia than these two parts.
Yeah, instead of forking VSCode, which is not modification-friendly, they should just use Theia, because it is maintained to be modular and can be used like a library to build IDEs of your choice.
Whoever disagreed and downvoted, can you explain why?
Google Cloud Shell is also Theia. I think it is fairly popular.
Eclipse (as in ecosystem) is fairly popular in Enterprise, but since it exposes all the knobs, and is a bona fide IDE which has some learning curve, people stay away from it.
Also, it used to be kind of heavy, but it became lighter because of Moore's law and good code management practices across the board.
I'm planning to deploy Theia in its web-based form if possible, but I haven't had the time to tinker with it yet.
Also of note: VSCode's main architect, Erich Gamma, was one of Eclipse's architects and a co-author of the famous GoF book.
Didn't know that. Now, that's interesting.
Using Eclipse as "the Java LSP" in VSCode makes more sense now.
Nevertheless, as much as I respect Erich for what he did for Eclipse, I won't be able to follow him to VSCode, since I don't respect Microsoft as much.
So not also using Github, LinkedIn, TypeScript (any FE framework that uses it), any Microsoft owned studios games, no Linux kernel contributions, GHC contributions,....
It is kind of hard to avoid nowadays.
Here's a session with him on VSCode's history:
"The Story of Visual Studio Code with Erich Gamma and Kai Maetzel"
https://www.youtube.com/watch?v=TTYx7MCIK7Y
My personal code doesn't get uploaded to GitHub anymore, and I open my LinkedIn twice a year or so.
I don't do Web Development, I live in the trenches. Since I don't own a desktop system anymore, I don't honestly game.
I'm exposed to them via systemd and Linux Kernel, yes, but at least both are licensed with GPL.
At least I'm trying to minimize my exposure.
For more context, please see https://news.ycombinator.com/item?id=44634786
Thanks for the video, btw. I'll take a look the moment I have time.
Theia is different from the Eclipse IDE: it's written in JS, not Java, and doesn't share any code with Eclipse, which is fully Java.
Yes, I know.
This is why I used "(as in ecosystem)" in the first paragraph. It was a bit late when I wrote that comment, and it turned out to be very blurry, meaning-wise.
My bad.
eclipse still is alive holy shit
Installing the VSCode extension pack for Java runs a headless version of Eclipse JDT under the hood, which isn’t quite what I think of as lightweight.
What's wrong with that? If they re-implement the whole thing it would amount to the same code size. It's the JDT language SERVER not some sort of "headless" software with UI needlessly bundled.
https://marketplace.visualstudio.com/items?itemName=redhat.j...
Java isn't quite what I think of as lightweight. I mean it probably can be, but most Java engineering I know of is all about adding more and more libraries, frameworks, checks, tests, etc.
You can set the launchMode to LightWeight which spins up a syntax-only language server.
I would be interested to see a similar analysis of ByteDance's video editor, CapCut (desktop version). The editor itself is amazing, IMO it has the best UI of any video editing software I've used. Surely, it's full of telemetry and/or spyware, though, but it would be good to know to which extent. I couldn't find any such analysis.
Great analysis, well done! Since you've already done VSCode, Trae, and Cursor, can you analyse Kiro (the AWS fork)? I'm curious about their data collection practices.
Anecdata but Kiro is much, much, much, much easier to put through corporate procurement compared to its peers. I'm talking days vs months.
This is not because it is better and I've seen no inclination that it would somehow be more private or secure, but most enterprises already share their proprietary data with AWS and have an agreement with AWS that their TAMs will gladly usher Kiro usage under.
Interesting distinction: privacy/security as it relates to individuals is taken at face value, while as it relates to corporations it is taken at disclosure value.
This seems perfectly rational. If you're already entrusting all your technical infrastructure to AWS, then adding another AWS service doesn't add any additional supply-chain risk, whereas adding something from another vendor does do that.
Am I the only one that finds this not to really be that obtrusive? Like, I don't really see anything there that I'd be offended by them taking.
I don't want any program on my computer, including the OS, to make any network calls whatsoever unless they're directly associated with executing GUI/CLI interactions I have currently undertaken as the user. Any exception should be opt-in. IMHO the entire Overton window of these remote communications is in the wrong place.
Yeah but this is industry standard
Not saying this is good, but everyone does this.
It's industry standard for porn stars to get fucked on camera. But I'm not a porn star, and I don't want to be fucked on camera.
Handwaving away this abuse of privacy by saying "everyone does it because it makes money" is a gross justification.
The same reasoning applies to your porn star analogy:
No one forces you to use these tools; you can use other tools that suit your needs.
If you read the terms of service and the privacy policy and then use it anyway, you've agreed to it, and the company has the right to do it.
No one is forcing anyone to use Trae.
But the telemetry settings not working and the actions of the Trae moderators to quash any discussion of the telemetry is extremely concerning. People should be able to make informed decisions about the programs they are using, and the Trae developers don't seem to agree.
https://consumerrights.wiki/index.php/EULA_roofie
Then why are you in a porn movie?
To further the analogy, sex may be an industry, but not everyone who participates does so commercially. Some who do so commercially may not want to be filmed.
> Yeah but this is industry standard
we don't have to accept it. but people say "it's just how we do it" and suddenly people just accept it.
i really feel like our society is going to collapse soon, if it hasn't already begun to. the amount of total crap that people are put through just so that ads can be more targeted to users. we are creating a hellscape for privacy and freedom just so people click on ads. it is pure and complete insanity, and no one cares.
The industry is in a rather sorry state right now though. We can, and should, do better.
I would not want to share these:
Unique Identifiers: Machine ID, user ID, device fingerprints
Workspace Details: Project information, file paths (obfuscated)
Plus OS details.
I'd rather none.
How do you do abuse detection for free-tier without these?
Provide a light version of your app for the free tier that does not use any remote resources ofc.
Then you don't have to worry about "abuse".
Counter-abuse is hardly the answer
Same way Linux does it.
I always want the choice to be mine.
I was interested in learning Dart until the installer told me Google would be collecting telemetry. For a programming language. I’ve never looked at it again.
It can be disabled btw. And no telemetry is collected on first run.
I keep it disabled for both Dart and Flutter.
Maybe someday I’ll give it a shot, but it’s hard to undo the damage from a terrible first impression like that.
As a somewhat paranoid person I find this level of paranoia beyond me. Like, do you own a car? Or a phone? A credit card? Walk around in public where there's cameras on every block? I don't agree with it at all, but the world we're living in makes it impossible to not be tracked with way more than (usually anonymized) telemetry data.
> (usually anonymized) telemetry data.
Anonymization is usually a lie:
https://news.ycombinator.com/item?id=20513521
https://news.ycombinator.com/item?id=21428449
Also please stop with security/privacy nihilism, https://news.ycombinator.com/item?id=27897975
It's not nihilism. I still ad block, use an RFID wallet, and don't install any apps on my phone, I rarely use google for anything. But at some point when something is so ridiculously useful and the data they're getting doesn't really mean much of anything I have to stop caring. I use Windows 11 (gross) because it lets me play video games with my friends I can't otherwise. I use Uber because it lets me get across town. I use Visual Studio because it helps me code. I use Chatgpt because it helps me with so many things. To take away any one of those because I'm a privacy absolutist seems silly to me. It has the exact same vibe of never leaving your room because you're afraid of all the cameras.
I'd like there to be a push back against these companies because I find their practices disgusting but running linux with only open source software and a fairphone is just an extreme I'm unwilling to entertain because it's just not possible in my (or most people's) world.
> and don't install any apps on my phone, I rarely use google for anything
And yet you are fine with personalized telemetry from Google on your PC? This is self-contradictory.
In fairness, as others have pointed out, the phone is much more personal than the home computer. Your phone is almost always with you, collecting much more intimate data than your PC can.
This is not always true, at least not for me. I trust my laptop with disabled Intel ME running Qubes OS much more than my phone.
I didn't say the phone was more trusted, I said it was more personal.
Your phone almost certainly knows where you are at all times, for example. It may know whether you're walking or sitting. It knows who calls you, who you call, who you message.
The laptop may know some of that, but it doesn't have the same sensors, and doesn't stay with you most times you leave the house.
> Your phone almost certainly knows where you are at all times, for example. It may know whether you're walking or sitting
Indeed, it's true for most people, but I use Librem 5 with hardware kill switches for modem, sensors etc.
I think the thing you neglect when having setups like this is that you start to garner interest from law enforcement if they ever come across you. You're trying so hard to cover your tracks that you stand out very clearly in a crowd.
There's a middle ground between living deep in the woods without windows and walking around naked in public.
It seems you are talking about Social Cooling, https://news.ycombinator.com/item?id=24627363. The more people like me exist, the easier it will be for actual activists and journalists to do their work. Privacy and anonymity are crucial for democracy.
I'm aware I'm being tracked all the time. That doesn't mean I have to encourage more of it.
We all pick our own battles.
Seems like a lot, especially after checking "disable telemetry"
"file paths (obfuscated)" -- this is likely enough for them to work out who the user is, if they work on open source software. They get granular timing data and the files the user has edited, which they could match with open source PRs in their analytics pipeline.
I suspect they aren't actually doing that, but the GDPR cares not what you're doing with the data, but what is possible with it, hence why any identifier (even "obfuscated") which could lead back to a user is considered PII.
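To make that concrete, here's a toy sketch of the linkage attack (the hashing scheme below is purely hypothetical; Trae's actual obfuscation transform isn't documented). Any deterministic one-way transform over a small, enumerable namespace of public file paths can be reversed by simply hashing the candidates:

```python
import hashlib

def obfuscate(path: str) -> str:
    # Stand-in for whatever one-way transform the client applies (hypothetical).
    return hashlib.sha256(path.encode()).hexdigest()[:16]

# What the vendor's analytics pipeline sees in telemetry events.
events = [obfuscate("src/parser.c"), obfuscate("README.md")]

# An analyst with a corpus of public repos just hashes every known path
# and joins: the "obfuscation" is only as strong as the path's secrecy.
public_paths = ["src/parser.c", "docs/index.md", "README.md"]
lookup = {obfuscate(p): p for p in public_paths}
print([lookup.get(e) for e in events])   # ['src/parser.c', 'README.md']
```

A per-install salt would break this particular join, but the events would still be linkable to each other within one install, which is exactly what the GDPR's notion of pseudonymous PII covers.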
Yes, you're the only one.
Imagine your text editor sending network data to unknown resources and there is no way to disable that
Honestly, I found this whole thread kind of strange. There’s nothing here beyond what most connected IDEs — or even basic office software — already collect by default.
It feels like the goal was more about grabbing attention than raising a real issue. But sure, toss “ByteDance” and “data” into a headline and suddenly it’s breaking news. I'm just tired of this kind of "Big News"- it's boring.
I'm not sure if you were being sarcastic, but I honestly don't think how it could possibly get any more intrusive without directly uploading files.
They don't want telemetry ever disabled, even for a minority of people who do toggle it off. Why?
Disabling telemetry might be interpreted as a self-indicated signal of "I have something to hide", so they jack up the snooping.
Or "I'm a power user" of sorts. Probably a very small minority of users fiddle that setting.
Dang said a similarly small minority of users here do all the commenting.
This is true of practically every online community. The vast majority of the users are passive participants, a small fraction contribute, and a small subset of contributors generate most of the content. Reddit is a prime example of this, the numbers are incredibly lopsided there.
See: “Most of What You Read on the Internet is Written by Insane People”
https://old.reddit.com/r/slatestarcodex/comments/9rvroo/most...
https://news.ycombinator.com/item?id=18881827
HN discussion of that link for anyone curious
I suspect many (or all) VPNs do secret logging. People will do their most interesting secret activities on those.
Any VPN worth its salt has gone through a security audit and/or is based in a sane country where that sort of thing is illegal.
Like Signal users. The only thing it signals is that the user is interesting. There is also a reason why it is not allowed for classified use.
This isn't true, this is the sort of toxic "if I have nothing to hide then why value privacy" ideology that got us into this privacy nightmare.
Every single person has "something to hide", and that's normal. It's normal to not want your messages snooped through. It doesn't mean you're a criminal, or even computer-savvy.
Mhhh it is not really about “nothing to hide”, it was more that if you use niche services targeted at privacy, it puts a big target on you.
Like the Casio watches, travelling to Syria, using Tor, Protonmail, etc…
When it is better in reality to have a regular watch, a Gmail with encrypted .zip files or whatever, etc.
It does not mean you are a criminal if you have that Casio watch, but if you have this, plus encrypted emails, plus travel to some countries as a tourist, you are almost certain to put yourself in trouble for nothing, while you tried to protect yourself.
And if you are a criminal, you will put yourself in trouble too, also for nothing, while you tried to protect yourself.
This was the basis of Xkeyscore, and all of that to say that Signal is one very good signal that the person may be interesting.
1. I don't really think this is true generally.
2. Using a secure, but niche, service is still more secure than using a service with no privacy.
Sure, you can argue using Signal puts a "target" on your back. But there's nothing to target, right? Because it's not being run by Google or Meta. What are they gonna take? There's no data to leak about you.
If I were a criminal, which I'm not, I'd rather rob a bank with an actual gun than with a squirt gun. Even though having an actual gun puts a bigger target on your back. Because the actual gun works - the squirt gun is just kinda... useless.
>If I were a criminal, which I'm not, I'd rather rob a bank with an actual gun than with a squirt gun. Even though having an actual gun puts a bigger target on your back. Because the actual gun works - the squirt gun is just kinda... useless
Actually, there was a case... I can't recall but it might have been in Argentina, where the robbers did explicitly use fake guns when robbing the banks because doing so still actually worked for the purposes of the robbery, and it also reduced their legal liability.
What flag does using a Casio watch raise? I have a G Shock that i love though i don't wear it as often nowadays.
It could put you in Guantanamo Bay: https://en.m.wikipedia.org/wiki/Casio_F-91W#Usage_in_terrori...
The F-91W has long been associated with terrorism for its use in improvised time-bombs: https://en.wikipedia.org/wiki/Casio_F-91W#Usage_in_terrorism
Oh so now we target people for buying inexpensive things? "Oh no! This guy doesn't have a $500 watch".
Yeah, I think it's pretty stupid, especially considering the F-91W is one of the most common and best-selling watches in the world.
almost as if it was a trap... ;p
It's a dark pattern called "placebo controls" - giving users the illusion of choice maintains positive sentiment while maximizing data collection, and avoids the PR hit of admitting telemetry is mandatory.
Telemetry toggles add noise to the data at the very least. IMO it's part of the reason you're actually better off with no client-side telemetry at all. Obviously they see it the opposite way.
That's assuming this is intended behaviour rather than just a bug that they don't care about fixing.
Occam's razor. Not Hanlon's.
Great write up OP!
Your analysis is thorough, and I wonder if their reduction of processes from 33 to 20...(WOW) had anything to do with moving telemetry logic elsewhere (hence increased endpoint activity).
What does Bytedance say regarding all this?
I really like the fact that I've come back to a TUI (the Helix editor) recently.
I'm trying Zed too, which, as a commercial product, I believe comes with telemetry as well... but yeah, learning the advanced rules of a personal firewall is always helpful!
How are you finding it compares to just using [Neo]Vim with all the plugins and custom configs? What improvements does it offer?
Two thoughts:
1. Try using pi-hole to block those particular endpoints via making DNS resolution fail; see if it still works if it can’t access the telemetry endpoints.
2. Their ridiculous tracking, disregard of the user preference to not send telemetry, and behavior on the Discord when you mentioned tracking says everything you need to know about the company. You cannot change them. If you don’t want to be tracked, then stay away from Bytedance.
Why use pihole? Most OSes have a hosts file you can edit if you're just blocking one domain.
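For reference, a minimal sketch of the hosts-file approach (the telemetry hostnames here are placeholders; substitute whatever shows up in your own capture). This operates on a local copy; for real use you'd point it at /etc/hosts as root:

```python
# Sketch of a tiny, idempotent hosts-file blocker. 0.0.0.0 makes lookups
# resolve to an unroutable address so connections fail fast.
import shutil

BLOCKLIST = ["mon.example-telemetry.com", "starling.example-telemetry.com"]
MARKER = "# telemetry-block"

def apply_blocklist(hosts_path: str) -> int:
    with open(hosts_path) as f:
        text = f.read()
    new_lines = []
    for host in BLOCKLIST:
        entry = f"0.0.0.0 {host} {MARKER}"
        if entry not in text:          # idempotent: skip entries already present
            new_lines.append(entry)
    if new_lines:
        with open(hosts_path, "a") as f:
            f.write("\n" + "\n".join(new_lines) + "\n")
    return len(new_lines)              # number of entries added this run

shutil.copy("/etc/hosts", "hosts.test")   # demo on a copy, not the real file
print(apply_blocklist("hosts.test"))      # 2 on the first run
print(apply_blocklist("hosts.test"))      # 0 on the second run (idempotent)
```

As the replies below note, this only affects software that resolves names through the OS resolver.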
Hate to break it to you, but /etc/hosts only works for apps that use getaddrinfo or similar APIs. Anything that does its own DNS resolution, which coincidentally includes anything Chromium-based, is free to ignore your hosts file.
But pi-hole seems equally susceptible to the same issue? If you're really serious about blocking you'd need some sort of firewall that can intercept TLS connections and parse SNI headers, which typically requires specialized hardware and/or beefy processor if you want reasonable throughput speeds.
I configured my router to redirect all outbound port 53 udp traffic to adguard home running on a raspberry pi. From the log, it seems to be working reasonably enough, especially for apps that do their own dns resolution like the netflix app on my chromecast. Hopefully they don't switch to dns over https any time soon to circumvent it.
DNS over https depends on the ability to resolve the DoH hostname via DNS, which is blockable via PiHole, or depend on a set of static IPs, which can be blocked by your favorite firewall.
A sufficiently spiteful app could host a DoH resolver/proxy on the same server as its api server (eg. api.example.com/dns-query), which would make it impossible for you to override DNS settings for the app without breaking the app itself.
or it wouldn't even need to use any sort of dns. bit of a silly discussion.
You can’t just intercept TLS, unless you can control the certificate store on the device.
In the context of snooping on the SNI extension, you definitely can.
The SNI extension is sent unencrypted as part of the ClientHello (the first part of the TLS handshake). Any router along the way sees the hostname that the client provides in the SNI data, and can drop the packet if it so chooses.
Would that also be true for DNS over HTTPS?
When the nefarious actor is already inside the house, who knows to what lengths they will go to circumvent the protections? External network blocker is more straightforward (packets go in, packets go out), so easier to ensure that there is nothing funny happening.
On Apple devices, first-party applications get to circumvent LittleSnitch-like filtering. Presumably harder to hide this kind of activity on Linux, but then you need to have the expertise to be aware of the gaps. Docker still punches through your firewall configuration.
Set up your router to offer DNS through pihole and everything in your network now has tracking and ads blocked, even the wifi dishwasher.
Until everything starts using DoH (DNS over HTTPS). There is pretty much no reason to use anything else as a consumer nowadays.
In fact, most web browsers are using DoH, so pihole is useless in that regard.
You can make pihole work with DoH:
https://docs.pi-hole.net/guides/dns/dnscrypt-proxy/
Not true, see answer above. Block the domain name or IP addresses of the DoH server.
You can disable that
Sure, but the industry is moving in that direction so prepare for the uphill battle.
Even the dishwasher with Wi-Fi that you don't know has Wi-Fi, which will happily jump onto open networks or has a deal with Xfinity.
So that these domains are automatically blocked on all devices on a local network. Also, you can't really edit the hosts file on Android or iOS, but I guess mobile OSes are not part of the discussion here.
Although there are caveats -- if an app decides to use its own DNS server, sometimes secure DNS, you are still out of luck. I just recently discovered that Android webview may bypass whatever DNS your Wi-Fi points to.
The hosts file doesn't let you properly block domains. It only lets you resolve them to something else. It's the wrong tool for the job.
If you have multiple devices on the same LAN, all of them will use the pihole.
Are there any other companies I should worry about for tracking?
Meta is pretty much number one, Google is pretty much number two. Whoever number three is, they are very far behind.
For what it's worth, I do use Google products personally. But I won't go near Facebook, WhatsApp, or Instagram.
Microsoft is definitely not that far behind in scale. They own a ton of software and services that are used by basically everyone.
Yes.
Yeah, that was my point. I'm not sure what's so breathtaking about what ByteDance is doing. I'm not a fan. But with Meta, Google, Microsoft, and I'll throw in Amazon, a huge chunk of the general public's web activity is tracked. Everywhere. All the time. The people have spoken; they are okay with being tracked. I've yet to talk with a non-technical person who was shocked that their online activity was tracked. They know it is. They assume it is. ByteDance's range of telemetry does not matter to them. They just wanna keep on TikTok'ing. Why does telemetry sent to ByteDance matter? Is it a China thing? I'm not concerned about a data profile on me in China. I'm concerned about the ones here in the US. I'll stop. I'm not sure I have a coherent point.
> I'm not sure I have a coherent point.
Please see hn guidelines: https://news.ycombinator.com/newsguidelines.html
Luckily for you (and many others) there is no requirement that points be coherent.
I can’t really speak to the DNS blocking approach you mentioned, but as a regular user in the Trae community, I do want to clarify one thing:
The Discord timeout occurred because an anti-ads automod was triggered by crypto-related keywords. I saw that the community moderator already explained this.
I hope you can learn the truth rather than be misled.
I can also suggest OpenSnitch or Portmaster to anyone who's conscious of these network connections. I couldn't live without them; never trust opt-outs.
Okay, nowhere is it mentioned which IDE is being talked about. https://www.trae.ai/ or https://traeide.com/
Would be sad if wrong one is murdered.
Naming is hard, but if there really were two different AI IDEs with nearly identical names, that's no accident.
But it seems like traeide.com is, in the best case, someone's extremely misleading web design demo, and in the worst case, a scam.
On the traeide website:
> Educational demo only. Not affiliated with ByteDance's Trae AI. Is Trae IDE really free? What's the catch? Yes, Trae IDE is completely free with no hidden costs. As a ByteDance product, it is committed to making advanced AI coding tools accessible to all developers.
By the way TRAE isn't free anymore, they now provide a premium subscription.
If the latter really is just a web design demo, it has a bunch of red flags. Why the official-sounding domain? Download links for executables!!! If it is just a web design demo for a portfolio, why is there no contact information for the author whose work and skills it's supposed to advertise?
trae.ai
Nope, the other one.
Why would anyone use a "Bytedance VSCode fork" is beyond me
It's cheap: the AI features cost about half of what other editors are charging ($10/mo) and the free tier has a generous limit. I guess you pay the difference with something else :)
I’m one of those who use it—mainly because it’s cheap, as others have mentioned. I wish Cursor offered a more generous limit so I wouldn’t need another paid subscription. But it won’t. So Trae comes in — fulfilling that need and sealing the deal. This is what we call competition: it brings more freedom and helps everyone get what they want. Kudos to the competition!
I'm not defending Trae’s telemetry — just pointing out the hard truth about why pricing works and why many people care less about privacy concerns (all because there are no better alternatives for them, considering the price.)
By the way, for those who care more about pricing($7.5/M) — here you go: https://www.trae.ai/. It’s still not as good as Cursor overall (just my personal opinion), but it’s quite capable now and is evolving fast — they even changed their logo in a very short time. Maybe someday it could be as competitive as Cursor — or even more so.
> Why would anyone use a "Bytedance VSCode fork" is beyond me
Because the person using it works at Bytedance.
I guess your question is better phrased as: “Why would any non-Bytedance employee use Bytedance VSCode fork?”, to which I have no answer.
Except their own employees, of course. Apparently the main difference this has with MS' version is additional "AI features", so I'm not surprised...
I wonder how many of these telemetry events can sneakily exfiltrate arbitrary data like source code. For example, they could encode arbitrary data into span IDs, timestamps (millisecond and nanosecond components), or other per-event UIDs. It may be slow...but surely it's possible.
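To make the concern concrete, here's a hypothetical Python sketch of such a covert channel: arbitrary bytes packed into well-formed-looking 16-hex-character span IDs. Everything here (function names, chunking scheme) is invented for illustration; it's not anything Trae is known to do.

```python
import os

def encode_payload_as_span_ids(payload: bytes, chunk_size: int = 8) -> list:
    """Pack arbitrary bytes into fake 16-hex-char 'span IDs' (8 bytes each).

    A collector controlling both ends can reassemble the chunks; to a
    casual observer, each value looks like a random OpenTelemetry span ID.
    """
    ids = []
    for i in range(0, len(payload), chunk_size):
        chunk = payload[i:i + chunk_size]
        # Pad the final chunk with random bytes so every ID is well-formed.
        chunk = chunk + os.urandom(chunk_size - len(chunk))
        ids.append(chunk.hex())
    return ids

def decode_span_ids(ids: list, length: int) -> bytes:
    """Reverse the encoding, given the original payload length."""
    return b"".join(bytes.fromhex(s) for s in ids)[:length]

secret = b"def main(): ..."
ids = encode_span_ids = encode_payload_as_span_ids(secret)
assert decode_span_ids(ids, len(secret)) == secret
assert all(len(s) == 16 for s in ids)  # each looks like a plausible span ID
```

The same trick works with the low bits of nanosecond timestamps, just with far less bandwidth per event.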
Telemetry isn't the same thing as spying on user data. Read on if you're not clear about the differences:
What is telemetry data?
Telemetry is anonymous, aggregated usage data that helps developers understand how a product is being used. It usually includes stuff like:
- What features are being used (e.g. “X% of users use the terminal panel daily”)
- How often the app crashes, and where
- Performance stats (e.g. memory usage, load times)
- Environment info (like OS version, screen resolution — not personal files)
It does NOT include your source code, files, passwords, browser content, or any personal identifiers unless explicitly stated. And in most reputable products, it’s either anonymized or pseudonymized by default.
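For what "pseudonymized by default" can mean in practice, here's a minimal Python sketch, purely illustrative: a stable machine ID is replaced by a salted hash before any event leaves the machine, so the vendor can group events per install without recovering the raw ID.

```python
import hashlib
import secrets

# Hypothetical pseudonymization scheme (not any real product's code):
# the salt is generated once per install and never uploaded, so even the
# vendor can't map tokens back to hardware IDs.
SALT = secrets.token_bytes(16)

def pseudonymize(machine_id: str) -> str:
    """Replace a raw machine ID with a stable, non-reversible token."""
    return hashlib.sha256(SALT + machine_id.encode()).hexdigest()

token = pseudonymize("4C4C4544-0042-3010")
assert token == pseudonymize("4C4C4544-0042-3010")  # stable within an install
assert "4C4C" not in token  # raw ID is not visible in the token
```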
Who uses telemetry? Everyone.
- VS Code collects telemetry to improve editor performance and user experience. VS Code forks inherit the VS Code telemetry switch by default.
- Chrome uses telemetry to understand browser performance, crashes, and feature adoption.
- Slack, Discord, and Postman all rely on telemetry to debug, prioritize features, and improve product quality.
Without telemetry, you can't know which features are working, where users are getting stuck, or what's actually causing bugs.
So when people say “telemetry = privacy breach,” they’re confusing helpful system analytics with data exploitation. The real concern should be around what is collected, how it’s stored, and whether users can opt out — not the mere existence of telemetry itself. Telemetry itself doesn’t directly cause high CPU usage; high CPU usage can have a lot of different causes.
Sharing here just because some of the reasoning and arguments in the original post seem off. Don't want people to get confused.
telemetry is not your source code.
It doesn’t include:
- Your code files
- What you’re writing
- Proprietary IP
- Passwords, tokens, env vars
- Anything personally identifiable
Instead, telemetry usually looks like “User clicked on send button” or “App crashed with error XYZ on macOS 14.1”
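As a concrete (and entirely hypothetical) illustration of that shape, a usage event might look like this; the field names are made up for the example, not taken from any real product's schema:

```python
import json
import uuid

# A hypothetical usage event, modeled loosely on VS Code-style telemetry.
event = {
    "event": "editor.action.clipboardPasteAction",  # what happened
    "timestamp": "2024-05-01T12:34:56Z",
    "session_id": str(uuid.uuid4()),                # pseudonymous, per session
    "properties": {
        "os": "darwin",
        "os_version": "14.1",
        "app_version": "1.89.0",
    },
    # Note what is absent: no file paths, no buffer contents, no identity.
}
print(json.dumps(event, indent=2))
```

Whether a given product actually restricts itself to this shape is, of course, exactly what posts like the OP try to verify.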
https://github.com/bytedance/trae-agent does not appear to have any telemetry.
Why do people use obvious spyware when free software exists?
Well there's a middle ground - Sublime Text isn't free but it's fantastic and isn't sending back all my code/work to the Chinese Government. Sorry, "Telemetry"
And the other side of the middle ground, Grafana being AGPL but requiring you to disable 4 analytics flags, 1 gravatar flag, and (I think) one of their default dashboards was also fetching news from a Grafana URL.
https://github.com/grafana/tempo/discussions/5001#discussion...
(Yes, that's for Grafana tempo, but the issue in `grafana/grafana` was just marked as duplicate of this.)
Another middle ground is CudaText, it is free and without telemetry.
Yes, why do people use products from Microsoft, Apple, Google, Amazon, ...
> Yes, why do people use products from Microsoft, Apple, Google, Amazon, ...
I work at Apple, so I’m not concerned about being monitored—it’s all company-owned equipment and data anyway.
It was the same when I worked at Microsoft. I used Microsoft products exclusively, regardless of any potential privacy concerns.
Employees at Google and Amazon do the same. It’s known as “dogfooding”—using your own products to test and improve them (https://en.wikipedia.org/wiki/Eating_your_own_dog_food).
As for why people outside these companies use their products, it usually comes down to two reasons: a) Their employer has purchased licenses and wants employees to use them, either for compliance or to get value from the investment; or b) They genuinely like the product—whether it’s because of its features, price, performance, support, or overall experience.
Hmm. Are you aware that I was responding to this comment?
> Why do people use obvious spyware when free software exists?
So, even though the poster was referring to ByteDance when they said "obvious spyware", I was feigning incomprehension in order to ask the question, how do we differentiate ByteDance from what Microsoft, Apple, Google, Amazon (and the rest) do.
It's a real question - why do technical people, who arguably should know better, and can do something about it - continue to use these data-harvesting and user-selling platforms? The answer is obvious when it's the case of an employee of those companies, I grant you that.
My apologies if you feel your response did address that, and I missed it. If so, please help me see what I missed.
Because the alternatives suck.
In this case, the software being analyzed is the alternative that sucks.
If we are talking about telemetry…. Seriously all these products are sending telemetry data
And the Snowden revelations happened, which programmers and sysadmins and etc saw, and then... continued as before, in the large majority of cases. It'd be baffling, if it wasn't so easily explained by the usual mixture of self-interest and moral cowardice.
Because there’s a huge amount of money behind smearing free software
As an asset who successfully infiltrated a rival country's tech company you want deniability. Bringing your own IDE does not look suspicious.
Telemetry isn't the same thing as spying on the user. People use it because it's not actually spying on them.
It is literally spying on the user.
Unless you're somehow saying telemetry doesn't report anything about what a user is doing to its home server.
Spying and telemetry are not something specific to ByteDance. Example: Google? Or Microsoft? Why is it a problem only when it is ByteDance or Huawei? For the exact same activity.
In fact, the Chinese entities are even less likely to share your secrets with your government than their best friends at Google.
No one in the chain of comments you are replying to has mentioned anything about Google, and on HackerNews you will find the majority sentiment is against spying in all forms - especially by Google, Meta, etc.
Even if we interact with your rhetoric[1] at face value, there is a big difference between data going to your own elected government versus that of a foreign adversary.
[1] https://en.wikipedia.org/wiki/Whataboutism
So you are implying at the end that it is better that your secrets (“telemetry”) go to your local agencies and to possible relatives or family who work on Gmail, Uber, etc ?
Yes, naturally I trust my own elected government, or possible relatives/family, far more than I trust a foreign adversary
I'm sorry but why? Your government can use this data to actually hurt you and put you on the no-fly list, or even put you in prison.
But a foreign government is limited to what it can do to you if you are not a very high-value target.
So I try as much as possible to use software and services from a non-friendly government because this is the highest guarantee that my data will not be used against me in the future.
And since we can all agree that any data that is collected will end up with the government one way or another, using foreign software is the only real guarantee.
Unless the software is open source and its server is self-hosted, it should be considered Spyware.
My comment has nothing to do with a specific company but about telemetry and spying on the customer.
"What about Google" is not a logical continuation of this discussion
> Why is it a crime only when it is ByteDance or Huawei?
It should be a crime for Google as well.
"Whataboutism" is a logical fallacy.
https://en.wikipedia.org/wiki/Whataboutism
In my mind, the difference is that spying does or can contain PII, or PII can be inferred from it, whereas telemetry is incapable of being linked to an individual, to a reasonable extent.
In my mind, any feature collecting information about me, truly anonymized or not is spying if it's opt out.
Every single piece of telemetry sent over the internet includes PII - the IP address of the sender - by virtue of how our internet protocols are designed.
> includes PII - the IP address of the sender
Apple provides telemetry services that strips the IP before providing it to the app owners. Routing like this requires trust (just as a VPN does), but it's feasible.
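The idea behind such a relay can be sketched in a few lines of Python. This is a toy model, not Apple's actual implementation: the vendor receives the payload, but the client address it sees belongs to the relay.

```python
def relay(request: dict) -> dict:
    """Forward a telemetry request, dropping network-level identifiers.

    Hypothetical IP-stripping relay: the app owner gets the payload, but
    the only address it sees is the relay's own.
    """
    return {
        "payload": request["payload"],     # opaque to the relay if encrypted
        "client_ip": "relay.example.net",  # relay's address, not the user's
        # Deliberately not forwarded: request["client_ip"], headers, cookies.
    }

original = {"client_ip": "203.0.113.7", "payload": {"event": "app_launch"}}
seen_by_vendor = relay(original)
assert "203.0.113.7" not in str(seen_by_vendor)
```

The trust question just moves one hop: the relay operator still sees your IP, which is why this only works if you trust the relay more than the vendor.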
You said it's different from spying because there is no PII in the information. Now you're saying it's different because it's not given to app owners.
Why is it relevant whether they provide it to app owners directly? The issue people have is the information is logged now and abused later, in whatever form.
Which is logically consistent at the app-owner level, which is the context of my reply.
If the app owner can't obtain PII, I don't believe the app owner is spying.
Is Apple spying?
> Routing like this requires trust
It depends on whether you trust them, and their privacy policy. If they're functioning as a PII-stripping proxy, as they claim, then I would claim no, to the extent of what's technically possible. I would also claim that a trustworthy VPN is not spying on you. YMMV.
This is like saying every physical business is collecting PII because employees can technically take a photo of a customer. It's hard to do business without the possibility of collecting PII.
No, it's like saying a business that has a CCTV camera recording customers, and sending that data off site to a central location, where they proceed to use the data for some non-PII-related purpose (maybe they're tracking where in stores people walk, on average), is in fact sending PII to that off-site location.
Distinguishing factors from your example include
1. PII is actually encoded and handled by computer systems, not the mere capability for that to occur.
2. PII is actually sent off site, not merely able to be sent off site.
3. It doesn't assert that the PII is collected, which could imply storage, it merely asserts that it is sent as my original post does. We don't know whether or not it is stored after being received and processed.
I was giving a purely physical, analog example.
If you imagine the CCTV camera in my example is a film-video-camera and the processing happening off site is happening in a dark room and not on a computer... my more accurate version of your analogy is also analog.
At least spiritually not if the traffic is routed over a Tor circuit. :-)
Unless you control most of the Tor nodes :-)
So many US universities running such nodes, without ever getting legal troubles. Such lucky boys
I think "spying" implies "everywhere possible", including outside the app.
If anything it is spying on the application itself. This is limited in scope compared to spyware which is software which spies on users themselves.
Those who collect PII, anonymized or not, are collecting information for one or more legitimate purposes, and that same information lends itself to ends which can reasonably be construed as spying when it is inevitably exposed to those who desire to spy. Those app developers can’t plausibly deny knowing that this information sharing will occur or is exceedingly likely to occur, and by making such data collection opt-out, app developers knowingly are acting on behalf of spies, despite having no intention to directly spy themselves. If you are an app developer with opt-out telemetry or an end user of an app so developed, who is the spy or doing the spying is a distinction without a difference to my view.
Anonymized or not, opt-out telemetry is plain spying. Go was about to find out, but they backed out at the last millisecond and converted it to opt-in, for example.
Unfortunately opt-in telemetry is like no telemetry at all. Defaults matter.
No telemetry at all is a good thing to some (most?) people.
Telemetry can be implemented well. The software you use gets bugs fixed much faster, since you get statistics showing that some bugs have higher impact than others. And the more users software has, the less skilled they are on average at accurately reporting issues.
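A toy example of the kind of statistic meant here, assuming crash events that carry only an error signature and a platform, nothing user-specific:

```python
from collections import Counter

# Hypothetical crash reports as they might arrive from telemetry.
crashes = [
    {"signature": "NullPointer in Renderer::paint", "os": "windows"},
    {"signature": "NullPointer in Renderer::paint", "os": "linux"},
    {"signature": "Timeout in Net::fetch", "os": "windows"},
    {"signature": "NullPointer in Renderer::paint", "os": "macos"},
]

# Count occurrences per signature to rank bugs by impact.
impact = Counter(c["signature"] for c in crashes)
top_bug, count = impact.most_common(1)[0]
print(top_bug, count)  # → NullPointer in Renderer::paint 3
```

Individual bug reports can't give you this ranking, because most affected users never file one.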
> The software you use gets bugs fixed much faster since you get statistics that some bugs have higher impact than others.
Try talking to your users instead.
> The more users software has, the less skilled they are on average at accurately reporting issues.
No amount of telemetry will solve that.
The PowerShell team at Microsoft added opt-out telemetry to track when it was launched so they could make the case internally that they should get more funding, and have more internal clout.
It’s easy to argue that if you are a PowerShell user or developer you benefit from no telemetry, but it’s hard to argue that you benefit from the tool you use being sidelined or defunded because corporate thinks nobody uses it. “Talk to your users” doesn’t solve this because there are millions of computers running scripts and no way to know who they are or contact them even if you could contact that many people, and they would not remember how often they launched it.
https://learn.microsoft.com/en-us/powershell/module/microsof...
> it’s hard to argue that you benefit from the tool you use being sidelined or defunded because corporate thinks nobody uses it.
Let the corporation suffer then. With an open API, a third party will make a better one. Microsoft can buy that; corporations have a habit of doing that.
> “Talk to your users” doesn’t solve this because there are millions of computers running scripts
Why are you worried about the problems that scripts face? If the developer encounters issues in scripts, the developer can work to fix it. Sometimes that might mean filing a bug report... or a feature request for better documentation. Or the developer might get frustrated and use something better. Like bash.
> there are millions of computers running scripts and no way to know who they are or contact them
Why do they matter to you, or a corporation then?
> they would not remember how often they launched it.
If your users aren't interacting with you for feature requests and bug reports, then either you don't have users or you don't have good enough reachability from the users to you.
> "use something better. Like bash."
Bash isn't better.
> "Why are you worried about the problems that scripts face? Why do they matter to you?"
because I write and run such scripts.
> "Let the corporation suffer then"
Microsoft wouldn't suffer, PowerShell users would suffer.
> "sometimes that might mean filing a bug report... or a feature request for better documentation. "
In this scenario the PowerShell team has been defunded or sacked. Who will the bug report go to? Who will implement the feature request?
> "If your users aren't interacting with you for feature requests and bug reports, then either you don't have users or you don't have good enough reachability from the users to you."
Users are interacting with Microsoft for feature requests and bug reports. There are a thousand open issues on https://github.com/powershell/powershell/ and many more which were closed "due to inactivity". What difference does that make if Corporate doesn't want to fund a bigger team to fix more bugs unless it can be shown to benefit a lot of customers not just "a few" devs who raise issues?
> Bash isn't better.
It is, by virtue of running on Linux.
> because I write and run such scripts.
'kay. Learn how to do Engineering and the software will come just fine. You don't need telemetry to tell you anything about scripts. You need good error reports for your users to send to you instead.
> Microsoft wouldn't suffer, PowerShell users would suffer.
So what you're saying is that Microsoft doesn't care about its users. PowerShell users should use products from better companies then.
> In this scenario the PowerShell team has been defunded or sacked. Who will the bug report go to? Who will implement the feature request?
Why were they sacked?
Oh, right, because they didn't interact with their users.
Who will the bug report go to? Clearly it's the same as before: nobody. That's a Microsoft problem.
> What difference does that make if Corporate doesn't want to fund a bigger team to fix more bugs unless it can be shown to benefit a lot of customers not just "a few" devs who raise issues?
If Corporate doesn't want to fund bugfixes and features for people who actually file bug reports and talk to you, then that's poor behavior of corporate. Why do you want to contribute to the decline of your users privacy?
>Let the corporation suffer then.
Corporations provide value to others. It's not just the corporation that is missing out.
Corporations provide value to their shareholders. The things they sell and their customers are the product. They care about neither.
This is a systemic problem on Microsoft's side, it's not an upside of telemetry.
To be clear, I consent to send telemetry from some of the tools I use and deploy.
Their common pattern? They wait a bit, and ask nicely about whether I want to participate. Also, the dialog box asking the question defaults to off.
I read the fine print, look a the data they push, ponder and decide whether I'm cool with it or not.
Give me choice, be upfront and transparent. Then we can have a conversation.
To take that logic to its extreme: I'm sure we could have amazing medical breakthroughs if we just gave up that pesky 'don't experiment on non-consenting humans' hang-up we have.
The parent said "talk to your users instead of telemetry" and I said "there are scenarios where telemetry can get information that you cannot get by talking to users". How did you go from that to "experimenting on non-consenting humans"?
To take your logic to its extreme, you have a disease and are prescribed pills, and the pharmaceutical company says "we will track when you take the pills - unless you don't want us to?" and you would prefer the researchers get shut down for not knowing whether anyone actually takes the pills, and an unlimited number of people die from treatable diseases that don't get cured.
Medical research and consent doesn't work like this. If you track your patients without their consent, or you share their data without their explicit consent, you'll land in very hot water, which will cook you even before you can scream.
Similarly, a medical trial will take a very detailed consent before you can start.
Your opt-out telemetry is akin to your insurance company sending you powered, Bluetooth-enabled toothbrushes out of the blue to track you, and threatening to cancel your insurance if you don't use that toothbrush and send data to them.
Or as a more extreme example, going through an important procedure not with the known and proven method but with an experimental one, because you didn't opt-out and nobody bothered to tell you this. In reality, you need to sign consent and waiver forms to accept experimental methods.
> "Medical research and consent doesn't work like this."
Yes, I agree that person's comparison to non-consensual medical research is stupid.
> "Your opt-out telemetry is akin to your insurance sending you powered and Bluetooth enabled toothbrushes out of the blue to track you and threaten to cancel your insurance if you don't use that toothbrush and send data to them."
More akin to your insurance company making a public RFC where you can discuss the coming telemetry, then you choosing to ask your insurance for an optional toothbrush, being able to opt out of telemetry if you want to, the insurance company documenting how to opt out[1], you being able to edit the toothbrush source code to remove the telemetry entirely with the insurance company's approval because it's MIT licensed, and absolutely nothing happening to you if you opt out.
> I don't understand how you got from "there are scenarios where telemetry can get information that you cannot get by talking to users, here is one example" to "experimenting on non-consenting humans". What is the connection?
The connection is clear if your salary doesn't require you to not understand it.
Developers don't opt-in to telemetry? Maybe it's because they don't want to enable that telemetry, your experiments be damned.
Use proper engineering to demonstrate that your scripts work instead of demanding that users be your free software test team.
> "Use proper engineering to demonstrate that your scripts work instead of demanding that users be your free software test team."
This telemetry is not about demonstrating that scripts work, as I have said to you multiple times.
You said 'but we wouldn't have a lot of improvement without telemetry'. I am saying that we could have a lot of improvement in a lot of things if we wanted. We could have breakthroughs in medicine if we allowed human experimentation. The question is, where is that line? Your argument doesn't address that; it just tries to justify something that people think is morally wrong by stating that we get use from it.
> "You said 'but we wouldn't have a lot of improvement without telemetry'."
I did not say that. Within the context of Microsoft's internal funding, maybe, but we could have the same improvement by Microsoft throwing more money at the PowerShell team without this telemetry. The core thing I said was that the information the telemetry gets cannot be got by "talk to your users" not that the telemetry leads to amazing improvements.
It is still difficult for you to make the case that someone choosing to download PowerShell can be "not consenting" (and before you reply saying "PowerShell ships with Windows", the PowerShell which has telemetry does not [yet] ship with Windows).
Exactly. What users do on their computers is their own data. It's up to them to share it or not.
Surely that should be fortunately.
Any monitoring of my system without my explicit permission is spying.
Ha ha, free software also has tons of telemetry, it just belongs to GitHub.
Eh, I don't know how you could tell it is "obvious" "spyware", unless you are referring to the fact that it comes from Bytedance.
The mere fact that disabling telemetry does not at all disable telemetry is enough for it to be called spyware.
That's only after you read this article. The question is how you know it's spyware even before you install it. At least it's not clear to me from the GitHub README file that I would be knowingly installing spyware.
Has ByteDance produced literally anything to make that assumption unreasonable?
Amongst other things they do have a division that produces OKR tracking software. Just the weird story of another multinational I suppose.
As in objective and key result? That’s… weird but ok!
I remind you, again, that vi, gcc, as, ld, and make have no telemetry, launch few (if any) processes, do not need GB of RAM, and work well.
I am a die-hard VIM user. VIM is a text editor, not an IDE. You can I many D tools into its E, but it remains a text editor with disparate tools.
What is there in an IDE today, that is missing from (n)vim? With the advent of DAP and LSP servers, I can't find anything that I would use a "proper" IDE for.
- debuggers
- popup context windows for docs (kind of there, but having to respect the default character grid makes them much less capable and usually they don't allow further interaction)
- contextual buttons on a line of code (sure, custom commands exist, but they're not discoverable)
- "minimap"
Don't IDEs use DAP as well? That would mean neovim has 1:1 feature parity with IDEs when it comes to debugging. I understand the UI/UX might need some customization, but it's not like the defaults in whatever IDE fit everyone either.
Popup context windows for docs are super good in neovim, I would make a bet that they are actually better than what you find in IDEs, because they can use treesitter for automatic syntax highlighting of example code. Not sure what you mean with further interaction.
Contextual buttons are named code actions, and are available, and there are like 4 minimap plugins to choose from.
> That would mean neovim has 1:1 feature parity with IDEs when it comes to debugging.
How do I get a memory graph with custom event markers overlayed on it then? That's the default for VS for example.
Sorry I don't know enough about VS to answer this, but if the debugger in question is using DAP then there is a more than fair chance it's available in neovim as well.
The "minimap" is the only one here that isn't native. You can also have the file tree on the left if you want. Most people tend to use NerdTree[4], but as with a lot of plugins, there are builtins that are just as good. Here's the help page for netrw[5], vim's native file explorer.
Btw, this all works in vim. No need for neovim for any of this stuff. Except for the debugger, this stuff has been here for quite some time, and the debugger has been around as a plugin for a while too. All of this has been here since I started using vim, which was over a decade ago (maybe balloons didn't have as good an interface? Idk, it's been a while).
[0] https://vimdoc.sourceforge.net/htmldoc/debugger.html
[1] https://vimdoc.sourceforge.net/htmldoc/options.html#'balloon...
[2] https://github.com/wfxr/minimap.vim
[3] https://github.com/preservim/tagbar
[4] https://github.com/preservim/nerdtree
[5] https://vimhelp.org/pi_netrw.txt.html#netrw
> These are called "balloon"
And they are not interactive as far as I know. I've not seen a way to get a balloon on a type inside another balloon, and then go to the browser docs from a link in that.
> Do you mean something like this?
Yes, but that's still restricted to terminal characters (you could probably do something fancy with sixel, but still) - for larger files with big indents it's not useful anymore.
> contextual buttons on a line of code
For example options to refactor based on the current location. I could construct this manually from 3 different pieces, but this exists in other IDEs already integrated and configured by default. Basically where's the "extract this as named constant", "rename this type across the project" and others that I don't have to implement from scratch.
Second, sure, I refactor all the time. There are three methods I know. The best way is probably with bufdo and having all the files opened in buffers (tabs, windows, or panes are not required). But I'm not sure why this is surprising. Maybe you don't know what ctags are? If not, they are what makes all this possible, and I'd check them out because I think they will answer a lot of your questions.
Correct me if I'm wrong, but you are asking about "search and replace", right? I really do recommend reading about ctags, and I think these two docs will give you answers to a lot more than just this question[0,1]. Hell, there's even ThePrimeagen's refactoring plugin[2] in case you wanted to do it another way that's not vim-native.

But honestly, I really can't tell if you're just curious or trying to defend your earlier position. If you're curious and want to learn more, we can totally continue, and I'm sure others would love to add more. In that case I would avoid language like "vim doesn't" and instead phrase it as "can vim ___?", "how would I do ____ in vim?", or "I find ___ useful in VS Code, how do people do this in vim?" Any of those will have the same result but not be aggressive. But if you're just trying to defend your position, well... Sun Tzu said you should know your enemy, and I don't think you know your enemy.
[0] https://vim.fandom.com/wiki/Browsing_programs_with_tags
[1] https://vim.fandom.com/wiki/Search_and_replace_in_multiple_b...
[2] https://github.com/ThePrimeagen/refactoring.nvim
> you use completion, right? That's interaction?
Very basic one. What I mean is once you get the completion, how do you interact with that view - let's say you want to dig into a type that's displayed. Then you want to get to the longer docs for that type. There's nothing out there that does it as far as I know.
> Wait... you want it as an image?
Yes, the asciiart minimaps are cool, but they really don't have enough resolution for more complex longer files in my experience.
> The best way is probably with bufdo and having all the files opened in a buffer
You see why this is not great, right? That's an extra thing to think about.
> Maybe you don't know what ctags are?
I know. It's step 1 out of many for implementing proper refactoring system.
> but you are asking about "search and replace" right?
Search and replace with language and context awareness. You can diy it in vim or start stacking plugins. Then you can do the same with the next feature (like inserting method stub). But... I can just use an actual IDE with vim mode instead.
> And in that case I would avoid language like "vim doesn't"
Vim doesn't do those things, though. There's a whole ecosystem of plugin-of-the-day additions that add one thing or another. But it turns out it's easier to embed nvim in an IDE than to play with vim plugins until you get something close to an IDE. Been there for years, done that, got tired. VS with vim mode has better IDE features than vim with all the customised plugins.
And it's not very hard to remember things like bufdo, windo, and tabdo, because I'm already familiar with buffers, tabs, and windows. It's not an extra item in memory for me, so no, I don't see. It's just as easy and clear as if I clicked a button that said "do for all files".
You mean ins-completion? That's native. I can complete things from other files (buffers), ctags, and whatever. You can enable auto-suggest if you really want, but that's invasive and distracting for me. To each their own; the right setup is only the right setup for you, right? Yet I'm really not sure what's missing. I'll give you the minimap, but I personally don't really care about that one. Is it that big of a deal? (I already know what percentage of the file I'm in, and personally I'd rather the real estate be used for other things. But that's me.)

But so far this conversation has been you telling me vim doesn't do something, me showing you it does, and you just saying no. To me it just sounds like you don't know vim. It's cool, most people don't read docs ¯\_(ツ)_/¯ I mean, there's a lot of stuff that people who have been using vim for years don't know but is in vimtutor. How many people don't know about basic things like ci, completion (including line or file path completion), or <C-[>? How many people use :wq lol
I just like vim man. You don't have to, that's okay. I like that I can do marks. I love the powerful substitution system. I mean I can just write the signatures of my init functions and automatically create the class variables. Or deal with weird situations like this time some Python code had its documentation above the function and I could just bufdo a string replace to turn those into proper docstrings. I love that I can write macros on the fly, trivially, and can apply them generously. I love registers and how powerful they are. I mean I can write the command I want on a line, push it into a register, and then just call it with @. It's trivial to add to my rc if I like it enough. I love that it's really easy to drop in a local config file that sets the standards for the project I'm working on when it differs from my defaults and I can even share that with everyone! I really like the fact that I can have my editor on just about every nix machine and I don't even need to install it. I can work effectively on a novel machine disconnected from the internet.
I mean my love for vim isn't really just the navigation. But even in the navigation I'm constantly using bindings that most vim plugins don't have. It's not easier for me to use another system and add vim keybindings because that's only a very small portion of vim. I'd rather have all of vim and a fuck ton more of my resources.
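For anyone curious, the cross-buffer substitution and the register trick described above look roughly like this (the patterns and names are made up for illustration):

```vim
" Cross-buffer substitution (the 'do to all files' button); the 'e' flag
" suppresses the error in buffers that have no match, and :update saves
" each changed buffer as it goes:
:bufdo %s/\<init_widget\>/initWidget/ge | update

" The register trick: write an Ex command as plain text on a line, e.g.
"   :%s/\s\+$//e
" then yank that line into register q with "qyy (yy keeps the trailing
" newline, which acts as Enter when replayed) and run it any time with @q.
" If you like it enough, it's one paste away from living in your vimrc.
```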
I don't think you understand what I mean with the language aware rename. It's not even close to %s. Let's say I've got a c# app and I rename a class in VS. This will rename the class in the file, all class usages (but not as text - if I rename A.B, then it will not touch X.B due to different namespaces), including other projects in the solution, optionally will rename the file it lives in and optionally will/won't replace the text in comments. All listed for review and approval and I don't have to have any of those files open ahead of time.
This is something that the LSP provides (even in VScode), and is available in nvim yes. The command is vim.lsp.buf.rename(), and it is bound to "grn" by default.
https://neovim.io/doc/user/lsp.html#vim.lsp.buf.rename() (HN seems to not link the () part, you have to add it yourself)
All the other similar fanciness like renaming a file and automatically updating module references is also provided by the LSP, and is also available in nvim.
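Both VS Code's F2 rename and nvim's vim.lsp.buf.rename() boil down to the same LSP wire message. A minimal sketch of the JSON-RPC request an editor sends to the language server (the URI, position, and new name here are invented for illustration):

```python
import json

def rename_request(req_id: int, uri: str, line: int, character: int, new_name: str) -> str:
    """Frame an LSP textDocument/rename request (JSON-RPC 2.0 over stdio)."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "textDocument/rename",
        "params": {
            "textDocument": {"uri": uri},  # document containing the symbol
            "position": {"line": line, "character": character},  # zero-based
            "newName": new_name,
        },
    })
    # Every LSP message is framed with a Content-Length header
    return f"Content-Length: {len(body)}\r\n\r\n{body}"

msg = rename_request(1, "file:///project/src/billing.cs", 41, 14, "InvoiceService")
print(msg.splitlines()[0])
```

The server answers with a WorkspaceEdit mapping file URIs to text edits, which the editor presents for review and applies - the affected files don't need to be open beforehand, which is exactly the behavior described above.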
That's cool. This was not easily accessible last time I used plain nvim. I'm glad the functionality has made its way there.
The "integrated" part. I've written some here https://news.ycombinator.com/item?id=42871586
gcc/as/ld are batch processors from the GNU toolchain that offer few (if any) features beyond basic C/C++ support (and a handful of other languages), and they're non-standard toolchains on 2 out of 3 major operating systems, requiring a bit of heavy lifting to use.
It's kind of nonsense to bring them up in this conversation.
And electron is standard tooling? LOL
Toolchains have little to do with IDEs
I install vscode from scratch, install a few extensions I need, set 3 or 4 settings I use regularly, and bang in 5 minutes I have a customized, working environment catered for almost any language.
vi? Good luck with that.
And I say that as an experienced vim user who used to tinker a bit.
Hell, I will even feel comfortable in a vi terminal, though that's extremely rare to actually find. Usually vi is just remapped to vim
Edit:
The git folder with *all* my dotfiles (which includes all my notes) is just 3M, so I can take it anywhere. If I install all the vim plugins I currently have (some of which are old and unused), the total is ~100M. So...
3MB of configuration to make a text editor usable is a lot.
3MB? Good god, that'd be ridiculous!
You misread. I'm using 74K for *vim* configs. (Mostly because I have a few files for organization's sake)
I rounded up to 3M from 2.3M, and 1.4M of that is .git lol. 156K is all my rc files, another 124K for anything that goes into ~/.config, 212K for my notes, 128K for install scripts, 108K for templates, and 108K for scripts
I'll repeat myself, with the *same emphasis* as above. Hopefully it's clearer this time.
I was just saying it's pretty simple to carry *everything* around, implying that this is nothing in comparison to something like a plugin or even VScode itself. I mean I went to the VScode plugin page and a lot of these plugins are huge. *All* of the plugins I have *combined* (including unused ones) are 78M. The top two most-installed VSC plugins are over 50M each. Hell, the ssh plugin is 28M! I don't have a single plugin that big!

props to OP for the screenshots and payloads - that's how you do it. If any IDE wants trust, they know the recipe: make telemetry opt-in by default and provide a real kill switch.
Any proof of them not letting you speak because you mentioned “tracking”? This is weird.
I’m in Trae’s Discord (recently decided to try their Pro) and don’t see any such moderation. There are multiple users discussing privacy mode or the ToS, and I didn’t see any pushback.
How is memory usage related to telemetry? I don’t see much direct correlation here. If they cut their memory usage down to the same level as Cursor's, would that mean they aren't sending that much data?
pretty much every serious dev tool collects some form of usage data. I don’t think this is some evil conspiracy. it’s how teams figure out what’s working, what’s broken, and what needs improving.
Gotcha, blacklisting ByteDance.
So is there any difference between this fork's telemetry and Microsoft's telemetry? Aside from your data being exfiltrated to a different server...
Now imagine we trust this company with collecting psychological profiles of future western politicians ...
And future members of the military, the media, CEOs, etc…
you updated the GitHub repo multiple times, and from the looks of it, that so-called “official censorship” was actually just an automod mute. tying that to telemetry feels like you’re trying to stitch a story together just to sell a narrative
It’s honestly kind of funny - you got timed out by Discord’s automod and now you’re running around calling it “suppression of technical discussion.” lol. Don't be dramatic. There was a wave of crypto spam before, so the automod timeout often catches lots of bots and real people. I remember even community mods being timed out before.
Really? I saw he added later today that he was muted by AutoMod. Wondering if the whole “censorship” thing is true.
That’s fake news. If you check the GitHub issue and actually visit the Trae community, you’ll see the truth.
The community mod already told him the real reason — it was triggered by the keyword “token”, which has been flagged due to repeated crypto bot spam in the past. But instead, he deliberately claimed it was because of the word “track,” and framed a basic anti-spam automod as “censorship” and “punishment.”
https://github.com/segmentationf4u1t/trae_telemetry_research...
It honestly feels like an attention grab. That kind of intentional misrepresentation is pretty dishonest. And if you check the message history, he was actively chatting in the group before — no one was silencing him.
This software should be added to malware hash lists.
It's interesting that anyone is surprised by this.
Honestly great to see this. This is the power that FB/Microsoft/Google have if they ever decided to take the gloves off. Maybe this will be the motivating factor to get some privacy laws with fangs.
If you continue to send telemetry after I explicitly opt out, then I get to sue (or at least get a cut of the fines for whistleblowing)
Why isn't there a decently done code editor with VSCode level features but none of the spyware garbage?
Any recommendations?
This seems like an easy win for a software project
I'm eying Zed. Unfortunately I am dependent on a VS Code extension for a web framework I use. VS Code might have gotten to a critical level of network effect with their extensions, which might make it extremely sticky.
> Why isn't there a decently done code editor with VSCode level features but none of the spyware garbage?
JetBrains products. Can work fully offline and they don't send "telemetry" if you're a paying user: https://www.jetbrains.com/help/clion/settings-usage-statisti...
Why isn't there a decently done code editor with VSCode level features but none of the spyware garbage?
Isn't that what VS Codium is for?
VSCodium I think was abandoned. It was extremely buggy last time I used it.
Either way it uses electron. Which I hate so much.
It's not abandoned https://github.com/VSCodium/vscodium/commits/master/
It seems actively maintained, latest build on Flathub is from 3 hours ago.
VSCodium I think was abandoned.
Sad to hear that. I really enjoyed VS Codium before I jumped full-time into Nova.
(Unsolicited plug: If you're looking for a Mac-native IDE, and your needs aren't too out-of-the-ordinary, Nova is worth a try. If nothing else, it's almost as fast as a TUI, and the price is fair.)
> Why isn't there a decently done code editor with VSCode level features but none of the spyware garbage?
Because no other company was willing to spend enough money to reach critical mass other than Microsoft. VSCode became the dominant share of practically every language that it supported within 12-18 months of introduction.
This then allowed things like the Language Server Protocol which only exists because Microsoft reached critical mass and could cram it down everybody's throat.
well, to be fair, LSP is good gift.
Except that the LSP is now trapped in amber and cannot be evolved because it requires the agreement of Microsoft to change.
Because telemetry is how you effectively make a decently done editor. If you don't have telemetry you will likely end up lower quality, copying from other editors that are able to effectively build what users want.
emacs.
Because someone has to fund it?
Microsoft is content with funding it, the price is your telemetry (for now).
For high-quality development tools I use true FOSS, or I pay for my tools so that I know where the value is being extracted.
> Microsoft is content with funding it, the price is your telemetry (for now).
The price of VSCode is halo effect for Azure products
The price of VSCode is the lockin on the proprietary extensions.
Specifically: the remote code extension, the C/C++ extension and the Python extension.
I thought with VS Code the price is that it entices you into using Azure, where the enterprise big bucks are made.
Theia IDE seems to meet those requirements.
So... there is a reason VSCode is popular
Emacs still exists
Emacs is great, sure, but it lacks a decent text editor.
proof of emacs excellence.
what other software packages have 200 year old jokes about them?
Vi would, if the users could figure out how to quit out to save the jokes.
So does vi.
As does Neovim
bytedance, … stay far away.
Great work, glad to read about it in the Discord before u posted it here.
Come on over to neovim, the water is fine. Start with lazyvim if you like.
Hi HN, I was evaluating IDEs for a personal project and decided to test Trae, ByteDance's fork of VSCode. I immediately noticed some significant performance and privacy issues that I felt were worth sharing. I've written up a full analysis with screenshots, network logs, and data payloads in the linked post.
Here are the key findings:
1. Extreme Resource Consumption: Out of the box, Trae used 6.3x more RAM (~5.7 GB) and spawned 3.7x more processes (33 total) than a standard VSCode setup with the same project open. The team has since made improvements, but it's still significantly heavier.
2. Telemetry Opt-Out Doesn't Work (It Makes It Worse): I found Trae was constantly sending data to ByteDance servers (byteoversea.com). I went into the settings and disabled all telemetry. To my surprise, this didn't stop the traffic. In fact, it increased the frequency of batch data collection. The telemetry "off" switch appears to be purely cosmetic.
3. What's Being Sent: Even with telemetry "disabled," Trae sends detailed payloads including: hardware specs (CPU, memory, etc.); persistent user, device, and machine IDs; OS version, app language, and user name; and granular usage data like time-on-ide, window focus state, and active file types.
4. Community Censorship: When I tried to discuss these findings on their official Discord, my posts were deleted and my account was muted for 7 days. It seems words like "track" trigger an automated gag rule, which prevents any real discussion about privacy.
I believe developers should be aware of this behavior. The combination of resource drain, non-functional privacy settings, and censorship of technical feedback is a major red flag. The full, detailed analysis with all the evidence (process lists, Fiddler captures, JSON payloads, and screenshots of the Discord moderation) is available at the link. Happy to answer any questions.
VSCode is extremely unsafe and you should only use it in a managed, corporate environment where breaches aren't your problem. This goes with any fork, as well.
If you signed a Nondisclosure agreement with your employer, and you use—without approval—a tool that sends telemetry, you may be liable for a breach of the NDA.
I've never seen an NDA that would have clauses like that, and every job I've had required signing one. Do you have any examples?
Exactly, which is why you're using the tools provided for you in a managed corporate environment. :/
Unsafe because of telemetry or unsafe because of the plugin ecosystem?
Plugins and architecture
What should I be reading to know more about this? I am considering a move from Jetbrain's products to VS Code.
I tried this move once and lasted three days.
Opening IDEA after those three days was the same kind of feeling I imagine you’d get when you take off a too tight pair of shoes you’ve been trying to run a marathon in.
ymmv, of course, but for $dayjob I can’t even be arsed trying anything else at this point, it’s so ingrained I doubt it’ll be worth the effort switching.
If you are in the US and trust US products, then consider moving away from JetBrains to VS Code
I see a lot of confused comments blaming Microsoft, so to clarify: This analysis is about TRAE, a ByteDance IDE that was forked from VSCode: https://www.trae.ai/
Arguably it was Microsoft who started the whole trend of calling spyware "telemetry" to obfuscate what they were doing.
I can't prove it, but I think that's untrue. Anecdotally, I've only heard MS using it in the last 10 years or so, but it was pretty common terminology for years before that.
Last 10 years is right - Windows 10 was when they went all-in, and that was released in 2015. Before that, "telemetry" usually referred to situations where the same entity owned both ends of the data collection, so "consent" wasn't even necessary.
Microsoft caught flack for backporting telemetry to Windows 7 in the Windows 8/8.1 era. They really started sucking down data in Windows 10 but their spying started years before that.
Obfuscate? Telemetry arguably helps with deobfuscation.
> Telemetry arguably helps with deobfuscation.
Can you please expand on that? I have trouble understanding how telemetry helps me, as a user of the product, understand how the product works.
Yeah. One of the most frustrating things about modern gaming is companies collecting metrics about how their game is played, then publishing "X players did Y!" pages. They're always interesting, but.... why can't I see those stats for my own games?! Looking at you, Doom Eternal and BG3.
You can capture the telemetry data with a HTTPS MITM and read it yourself.
Or (if you're working lower level) you can see an obfuscated function is emitting telemetry, saying "User did X", then you can understand that the function is doing X.
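Once a proxy like Fiddler (which OP used) has decrypted the traffic, reading a telemetry batch is plain JSON inspection. A sketch with an invented payload, shaped loosely like the fields OP reports (device/machine IDs, OS version, usage events) - the real Trae schema will differ:

```python
import json

# Invented example batch; field names modeled on OP's findings, not the real schema.
captured = """{
  "header": {"device_id": "d-1234", "machine_id": "m-5678",
             "os_version": "Windows 11", "app_language": "en"},
  "events": [
    {"name": "window_focus", "params": {"focused": true}},
    {"name": "time_on_ide", "params": {"seconds": 1800}}
  ]
}"""

batch = json.loads(captured)
# Pull out anything that looks like a persistent identifier
ids = {k: v for k, v in batch["header"].items() if k.endswith("_id")}
print(f"{len(batch['events'])} events; identifiers: {ids}")
```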
> You can capture the telemetry data with a HTTPS MITM and read it yourself.
That's not helping me, the user.
That's helping me, the developer.
> Or (if you're working lower level) you can see an obfuscated function is emitting telemetry, saying "User did X", then you can understand that the function is doing X.
Again, it helps me, the developer.
Neither of these help me, the user.
Is it just me or does the formatting of this feel like ChatGPT (numbered lists, "Key Takeaways", and just the general phrasing of things)? It's not necessarily an issue if you checked over it properly but if you did use it then it might be good to mention that for transparency, because people can tell anyway and it might feel slightly otherwise
(or maybe you just have a similar writing style)
Yeah, the core was written by me, I just used an LLM to fix my broken English.
Don't pay any attention to people giving you shit for using translation software. A lot of us sometimes forget that the whole world knows a little English, and most of us native speakers have the ridiculous luxury of getting away with being too lazy to learn a few other languages.
I think it's good form to mention it as a little disclaimer, just so people don't take it the wrong way. Just write: (this post was originally written by me but formatted and corrected with an LLM since English is not my primary language).
From what I've seen, people generally do not like reading generated content, but every time I've seen the author come back and say "I used it because English isn't my main language," the community takes back the criticism. So I'd just be upfront about it and get ahead of it.
That was already added before this reply.
> using translation software
It's clear that this isn't what OP was doing. The LLM was writing, not merely translating. dang put it well:
> we want people to speak in their own voice
https://news.ycombinator.com/item?id=44704054
Part of the problem with using LLMs for translation is precisely that they alter the tone and structure of what you give them, rewriting it in LLM clichés and style, so it's unsurprising that people see that and just assume completely generated slop. It's unfortunate, and I would probably try to use LLMs if English weren't my first language, but I don't think it's as simple as "using translation software". I've not seen people called out that way for dodgy Google Translate translations, for example; it's a problem specific to LLMs and the fundamental issues with their output.
LLM writing style does to the brain what Microsoft Sam does to the ears.
my nipples explode with delight!
I wasn't annoyed about it, I just said it might be good to mention because people will notice anyway, and at this point there's enough AI slop around that it can make people automatically ignore it so it would be good to explain that. I'm surprised I got downvotes and pushback for this, I thought it was a common view that it's good to disclose this kind of thing and I thought I was polite about it
To be clear I think this has good information and I upvoted it, it’s just that as someone else said it’s good to get ahead of anyone who won’t like it by explaining why and also it can feel a little disingenuous otherwise (I don’t like getting other people to phrase things for me either for this reason but maybe that’s just me)
God forbid people actually learn the language they're trying to communicate in. I'd much rather read someone's earnest but broken English than LLM slop anyway.
It's disingenuous to call LLMs "translation software", and it's bad advice to say "don't pay attention those people".
Even if you don't agree with it, publishing AI-generated content will exclude from one's audience the people who won't read AI-generated content. It is a tradeoff one has to decide whether or not to make.
I'm sympathetic to someone who has to decide whether to publish in 'broken english' or to run it through the latest in grammar software. For my time, I far prefer the former (and have been consuming "broken english" for a long while, it's one of the beautiful things about the internet!)
Your content is great, and the participation of non-native English speakers in this community makes it better and richer.
I'd rather you write in broken English than filter it through an LLM. At least that way I know I'm reading the thoughts of a real human rather than something that may have its meaning slightly perturbed.
> might be good to mention that for transparency, because people can tell anyway and it might feel slightly otherwise
Devil's advocate: why does it matter (apart from "it feels wrong")? As long as the conclusions are sound, why is it relevant whether AI helped with the writing of the report?
It is relevant because it wastes time and adds nothing of substance. An AI can only output as much information as was inputted into it. Using it to write a text then just makes it unnecessarily more verbose.
The last few sections could have been cut entirely and nothing would have been lost.
Edit: In the process of writing this comment, the author removed 2 sections (and added an LLM acknowledgement), of which I referred to in my previous statement. To the author, thank you for reducing the verbosity with that.
AI-generated content is rarely published with the intention of being informative. * Something being apparently AI-generated is a strong heuristic that something isn't worth reading.
We've been reading highly-informative articles with "bad English" for decades. It's okay and good to write in English without perfect mastery of the language. I'd rather read the source, rather than the output of a txt2txt model.
* edit -- I want to clarify, I don't mean to imply that the author has ill will or intent to misinform. Rather, I intend to describe the pitfalls of using an LLM to adapt one's text, inadvertently adding a very strong flavor of spam to something that is not spam.
True, but there are many more people who speak no English, or speak it so badly that an article would be hard to understand. I face this problem now with the classes I teach. It's an electronics lab for physics majors. They have to write reports about the experiments they are doing. For a large fraction, this task is extraordinarily hard not because of the physics, but because of writing in English. So for those, LLMs can be a gift from heaven. On the other hand, how do I make sure that the text is not fully LLM generated? If anyone has ideas, I'm all ears.
I don't have any ideas to help you there. I was a TA in a university, but that was before ChatGPT, and it was an expectation to provide answers in English. For non-native English speakers, one of the big reasons to attend an English-speaking university was to get the experience in speaking and reading English.
But I also think it's a different thing entirely. It's different being the sole reader of text produced by your students (with responsibility to read the text) compared to being someone using the internet choosing what to read.
Because AI use is often a strong indicator of a lack of soundness. Especially if it's used to the point where its structural quirks (like a love for lists) shine through.
I just wanna read stuff written by people and not bots
simple as
Because AI isn't so hot on the "I" yet, and if you ask it to generate this kind of document it might just make stuff up. And there is too much content on the internet to delve deep on whatever you come across to understand the soundness of it. Obviously you need to do it at some point with some things, but few people do it all the time with everything.
Pretty much everyone has heuristics for content that feels like low quality garbage, and currently seeing the hallmarks of AI seems like a mostly reasonable one. Other heuristics are content filled with marketing speak, tons of typos, whatever.
> As long as the conclusions are sound
I can't decide to read something because the conclusions are sound. I have to read the entire thing to find out if the conclusions are sound. What's more, if it's an LLM, it's going to try its gradient-following best to make unsound reasoning seem sound. I have to be an expert to tell that it is a moron.
I can't put that kind of work into every piece of worthless slop on the internet. If an LLM says something interesting, I'm sure a human will tell me about it.
The reason people are smelling LLMs everywhere is because LLMs are low-signal, high-effort. The disappointment one feels when a model starts going off the rails is conditioning people to detect and be repulsed by even the slightest whiff of a robotic word choice.
edit: I feel like we discovered the direction in which AGI lies but we don't have the math to make it converge, so every AI we make goes completely insane after being asked three to five questions. So we've created architectures where models keep copious notes about what they're doing, and we carefully watch them to see if they've gone insane yet. When they inevitably do, we quickly kill them, create a new one from scratch, and feed it the notes the old one left. AI slop reads like a dozen cycles of that. A group effort, created by a series of new hires, silently killed after a single interaction with the work.
I want this to be the plot of bladerunner - deckard must hunt down errant replicants before they completely go insane due to context limits
Because it helps me decide if I should skim through or actually read it
Theory: Using AI and having an AI voice makes it less likely the conclusions are sound.
Looks like I missed a word here (probably “disingenuous”)
> As long as the conclusions are sound, why is it relevant whether AI helped with the writing of the report?
TL;DR: Because of the bullshit asymmetry principle. Maybe the conclusions below are sound, have a read and try to wade through ;-)
Let us address the underlying assumptions and implications in the argument that the provenance of a report, specifically whether it was written with the assistance of AI, should not matter as long as the conclusions are sound.
This position, while intuitively appealing in its focus on the end result, overlooks several important dimensions of communication, trust, and epistemic responsibility. The process by which information is generated is not merely a trivial detail, it is a critical component of how that information is evaluated, contextualized, and ultimately trusted by its audience. The notion that it feels wrong is not simply a matter of subjective discomfort, but often reflects deeper concerns about transparency, accountability, and the potential for subtle biases or errors introduced by automated systems.
In academic, journalistic, and technical contexts, the methodology is often as important as the findings themselves. If a report is generated or heavily assisted by AI, it may inherit certain limitations, such as a lack of domain-specific nuance, the potential for hallucinated facts, or the unintentional propagation of biases present in the training data. Disclosing the use of AI is not about stigmatizing the tool, but about providing the audience with the necessary context to critically assess the reliability and limitations of the information presented. This is especially pertinent in environments where accuracy and trust are paramount, and where the audience may need to know whether to apply additional scrutiny or verification.
Transparency about the use of AI is a matter of intellectual honesty and respect for the audience. When readers are aware of the tools and processes behind a piece of writing, they are better equipped to interpret its strengths and weaknesses. Concealing or omitting this information, even unintentionally, can erode trust if it is later discovered, leading to skepticism not just about the specific report, but about the integrity of the author or institution as a whole.
This is not a hypothetical concern, there are numerous documented cases (eg in legal filings https://www.damiencharlotin.com/hallucinations/) where lack of disclosure about AI involvement has led to public backlash or diminished credibility. Thus, the call for transparency is not a pedantic demand, but a practical safeguard for maintaining trust in an era where the boundaries between human and machine-generated content are increasingly blurred.
who cares? it's like using a spell checker. why does it matter?
yeah man, next time VSCode crashes and recovers your unsaved work, just remember: it knows way too much about you.
this kind of overreaction is exactly why real privacy concerns get ignored. it misses the point and just ends up misleading devs who actually care about meaningful issues.