Copilot Chat in VS Code is now open source

188 points | by ulugbekna 4 days ago

80 comments

gatienboquet 4 days ago
Here's the system prompt template they use: https://github.com/microsoft/vscode-copilot-chat/blob/4c72d6...
- rob-olmos 4 days ago
  "- cursor position marked as ${CURSOR_TAG}: Indicates where the developer's cursor is currently located, which can be crucial for understanding what part of the code they are focusing on."
  I was not aware that was a thing and useful to know. Thanks!
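  A minimal sketch of the idea in the quoted template line, assuming the cursor is communicated by splicing a marker string into the file text at the cursor offset. The marker value and the `markCursor` helper are hypothetical; the quoted line only shows the `${CURSOR_TAG}` placeholder, not its value or how it is inserted.

```typescript
// Hypothetical marker value; the real template interpolates ${CURSOR_TAG}.
const CURSOR_TAG = "<|cursor|>";

/** Returns `text` with the marker inserted at a zero-based character offset. */
function markCursor(text: string, offset: number, tag: string = CURSOR_TAG): string {
  // Clamp so an out-of-range offset still yields a valid document.
  const at = Math.max(0, Math.min(offset, text.length));
  return text.slice(0, at) + tag + text.slice(at);
}

const source = "function add(a, b) {\n  return a + b;\n}";
// Place the marker just before the `return` keyword.
console.log(markCursor(source, source.indexOf("return")));
```

  With a marker like this in the prompt, phrases such as "this line" have something concrete to refer to.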
  - mathgeek 4 days ago
    Interesting to hear how others use these tools. I often phrase things as “this line/method” which implies the tool knows where my cursor is.
    - al_borland 4 days ago
      I use the in-line prompt when I’m talking about a specific area. In the chat I always explained in words what part of the code I’m talking about. This tidbit of information will change how I use chat.
  - blitzar 4 days ago
    Isn't that needed for the tab completion?
    - ‍ 4 days ago
      [deleted]
  - BigGreenJorts 4 days ago
    I assumed they had this already, but I began to suspect it didn't actually exist. Disappointed to learn I was right bc half the time copilot pretends it can't read my code at all and asks me to go look for stuff in the code.
user4673568345 4 days ago
Copilot in vs code is kind of lackluster and really missing the sort of polish you’d expect from a company like Microsoft
- idrios 20 hours ago
  Not sure what polish you're looking for but it's been great for me. Smart autocomplete algorithm and good chatbot that acts like a personalized Google / stack overflow. It's not good enough to vibe code full products or debug hard issues, but that's just limitations of AI.
  Also it's all Microsoft, they made both the AI and the editor. I appreciate that they don't do anything extra in the editor to favor their own extensions.
- jacooper 4 days ago
  Even after agent mode was added? My experience with it has been great, haven't tried gemini CLI or cline yet but I don't think they are going to be much better
  - BugsJustFindMe 3 days ago
    My repeated experience with agent mode in vscode is that it constantly claims to have made changes that it never made, and no amount of prompting and reminding and nagging seems to fix it.
    - jacooper 2 days ago
      With which model? It's very reliable for me.
      - BugsJustFindMe 2 days ago
        It seems implausible that this would be a model issue vs a local software failing to edit documents issue.
- DidYaWipe 4 days ago
  USED to expect from Microsoft.
  Have you even used any of their products lately? Where "lately" = the last 15 years...
  - koolala 4 days ago
    Now I expect out-right malicious corruption.
    - Smar 4 days ago
      That sounds about right.
  - TiredOfLife 3 days ago
    Where "lately" = the last 50 years
hu3 4 days ago
Quick, someone use AI to scan the codebase and explain the decision tree of Copilot Chat with regard to how it handles prompts and responses.
- dataviz1000 4 days ago
  I very much need to know this also. First, tools [0] and prompts [1]. I'll get back to you in a minute while I back trace the calling path. One thing to note is that they use .tsx for rendering the prompts and tool responses.
  1. User selects ask or edit and AskAgentIntent.handleRequest or EditAgentIntent.handleRequest is called on character return.
  2. DefaultIntentRequestHandler.getResult() -> createInstance(AskAgentIntentInvocation) -> getResult -> intent.invoke -> runWithToolCalling(intentInvocation) -> createInstance(DefaultToolCallingLoop) -> loop.onDidReceiveResponse -> emit _onDidReceiveResponse -> loop.run(this.stream, pauseCtrl) -> runOne() -> getAvailableTools -> createPromptContext -> buildPrompt2 -> buildPrompt -> [somewhere in here the correct tool gets called] -> responseProcessor.processResponse -> doProcessResponse -> applyDelta ->
  [0] https://github.com/microsoft/vscode-copilot-chat/blob/main/s...
  [1] https://github.com/microsoft/vscode-copilot-chat/blob/main/s...
  [2] src/extension/intents/node/toolCallingLoop.ts
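  The loop at the heart of step 2 can be sketched roughly as follows. This is a simplified illustration of a generic tool-calling loop, not the extension's code: `runToolLoop`, the `Model` and `Tool` types, and the stubs are all hypothetical names chosen for this example.

```typescript
// All names are hypothetical; this only illustrates the general pattern.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (messages: Message[]) => Promise<ModelReply>;
type Tool = (args: Record<string, unknown>) => Promise<string>;

async function runToolLoop(
  model: Model,
  tools: Record<string, Tool>,
  userPrompt: string,
  maxTurns = 5,
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: userPrompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(messages);
    messages.push({ role: "assistant", content: reply.text });
    // No tool requests means the model has produced its final answer.
    if (reply.toolCalls.length === 0) return reply.text;
    // Otherwise run each requested tool and feed the result back in.
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const result = tool ? await tool(call.args) : `unknown tool: ${call.name}`;
      messages.push({ role: "tool", content: result });
    }
  }
  throw new Error(`no final answer after ${maxTurns} turns`);
}

// Stub model: asks for one file read, then answers using the tool result.
const stubModel: Model = async (messages) => {
  const last = messages[messages.length - 1];
  return last.role === "tool"
    ? { text: `The file says: ${last.content}`, toolCalls: [] }
    : { text: "", toolCalls: [{ name: "readFile", args: { path: "a.txt" } }] };
};
const stubTools: Record<string, Tool> = { readFile: async () => "hello" };

runToolLoop(stubModel, stubTools, "What is in a.txt?").then((answer) => console.log(answer));
```

  The turn limit is a design choice of this sketch: it keeps a model that never stops requesting tools from looping forever.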
  - chatmasta 4 days ago
    Something I’ve wanted to hack together for a while is a custom react-renderer and react-reconciler for prompt templating so that you can write prompts with JSX.
    I haven’t really thought about it beyond “JSX is a templating language and templating helps with prompt building and declarative is better than spaghetti code like LangChain.” But there’s probably some kernel of coolness there.
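    A minimal sketch of what declarative prompt templating could look like, assuming a tiny element factory with the call shape a JSX compiler would target. `h`, `render`, and the tag names are hypothetical and written as plain function calls so the example runs without a JSX toolchain; this is not how any existing library works.

```typescript
// Hypothetical mini-renderer for illustration only.
type PromptNode = string | { tag: string; children: PromptNode[] };

/** Element factory with the shape a JSX compiler would call (props unused here). */
function h(tag: string, _props: null, ...children: PromptNode[]): PromptNode {
  return { tag, children };
}

/** Renders a tree to text, wrapping each element's children in <tag>...</tag>. */
function render(node: PromptNode): string {
  if (typeof node === "string") return node;
  const body = node.children.map(render).join("\n");
  return `<${node.tag}>\n${body}\n</${node.tag}>`;
}

const prompt = h("prompt", null,
  h("instructions", null, "Answer briefly."),
  h("context", null, "file: a.ts", "cursor: line 3"),
);
console.log(render(prompt));
```

    Keeping the prompt as a tree until the last moment is what makes later steps, such as dropping low-priority sections to fit a token budget, straightforward to add.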
    - dataviz1000 3 days ago
      VSCode uses @vscode/prompt-tsx [0]
      They also provide documentation for all this. [1]
      VSCode also provides examples. [2]
      [0] https://github.com/microsoft/vscode-prompt-tsx
      [1] https://code.visualstudio.com/api/extension-guides/chat
      [2] https://github.com/microsoft/vscode-extension-samples/blob/m...
    - sqs 4 days ago
      Priompt might be what you're looking for: https://github.com/anysphere/priompt.
  - rightbyte 4 days ago
    >> Quick
    > in a minute
    Honestly. Why the hurry?
    - ‍ 3 days ago
      [deleted]
  - _boffin_ 4 days ago
    Care to also check if they do prompt decomposition into multiple prompts?
    - dataviz1000 4 days ago
      You're asking if they break the user prompt into multiple chunks?
      All I can find is counting the number of tokens and trimming to make sure the current turn's conversation fits. I cannot find any chunking logic to make multiple requests. This logic exists in the classes that extend IIntentInvocation, which has a buildPrompt() method.
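      A minimal sketch of the count-and-trim approach described above, assuming a crude character-based token estimate and a policy of dropping the oldest turns first. `estimateTokens` and `trimHistory` are hypothetical names; the real code presumably uses a proper tokenizer and may trim differently.

```typescript
// Crude stand-in for a real tokenizer: roughly four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

/** Keeps the newest turns whose combined estimated size fits within `budget`. */
function trimHistory(turns: string[], budget: number): string[] {
  const kept: string[] = [];
  let used = 0;
  // Walk from newest to oldest so the current turn is considered first.
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = estimateTokens(turns[i]);
    if (used + cost > budget) break;
    kept.unshift(turns[i]);
    used += cost;
  }
  return kept;
}

const history = ["an old, long question about setup", "a follow-up", "the current question"];
console.log(trimHistory(history, 8));
```

      Note that everything still goes out as a single request; trimming only decides what is left out of it.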
      - _boffin_ 4 days ago
        I believe it's this paper, but... not certain: https://arxiv.org/abs/2210.02406
        will update when i find more info.
        - ‍ 3 days ago
          [deleted]
- ‍ 4 days ago
  [deleted]
revskill 4 days ago
I won't trust Microsoft or Google until the end of the universe.
blufish 3 days ago
microsoft now is ibm then
v5v3 4 days ago
[flagged]
Ismoil 4 days ago
[flagged]
treymov 4 days ago
[flagged]
NYhacker 3 days ago
Hi
BrentOzar 4 days ago
I have a hard time getting excited about this when they have such an atrocious record of handling pull requests in VS Code already: https://github.com/microsoft/vscode/pulls
- msgodel 4 days ago
  It looks to me like they close nearly 30 PRs every day. That's kind of amazing.
  I'm no fan of Microsoft but that's a massive maintenance burden. They must have multiple people working on this full time.
  - Alupis 4 days ago
    If you examine the merged PR's - the overwhelming majority are from Microsoft employees. Meanwhile, community contributions sit and rot.
    - jemiluv8 4 days ago
      I thought they just open sourced this? Was there enough time to start reviewing community contributions?
    - esperent 3 days ago
      How would I be able to examine the PRs to verify this?
      - msgodel 3 days ago
        There aren't that many Microsoft employees, it took me a couple minutes to memorize the team.
        Of course the majority are from Microsoft, they do seem to merge in a fair amount of PRs from the community though.
      - lozenge 3 days ago
        Look at the first comment in the PR, it will have a badge "This user is a member of the Microsoft organization". Alternatively, look at the release notes on the website, any non-Microsoft contributions are listed at the bottom.
        - esperent 3 days ago
          Right, but how to do that for thousands of PRs to see that there's a bias? I assume it's a ton of work.
          - xyzzy123 3 days ago
            Why not sample 20 and see if you can spot a trend?
            - esperent 3 days ago
              Because I studied statistics.
              - xyzzy123 3 days ago
                You can get the JSON from like,
                gh pr list --repo microsoft/vscode --state merged --limit 1000 --json author,mergedAt,title > 1kprs.json
                Then you can do:
                jq -r '.[] | [.author.login, .author.name] | @tsv' 1kprs.json | sort | uniq -c | sort -nr
                And see there's only 63 authors and > 90% of the merged PRs are from Microsoft (which.. fair, it's their product).
                I think the signal is strong enough that you can legitimately reach the same conclusion by mk 1 eyeball.
                NOTE: I'm not criticising, it's a Microsoft-driven project and I am fine with that. They _do_ merge things from "random" contributors (yay) but they are overwhelmingly things that a) fix a bug while b) being less than 5 lines of code. If that is your desired contribution then things will go well for you and arguably they do well at accepting such. But they are very unlikely to accept a complex or interesting new feature from an outsider. All of this seems totally reasonable and expected.
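        The same tally can be sketched in TypeScript, assuming the `gh` output has been saved and parsed into an array of objects with an `author.login` field. `countByAuthor` and `teamShare` are hypothetical helpers, the sample logins are made up, and the list of team members is something you would have to supply yourself, since the JSON does not say who works where.

```typescript
// Only the field used here; `gh pr list --json author` returns more per entry.
type PullRequest = { author: { login: string } };

/** Counts PRs per author login, most prolific first. */
function countByAuthor(prs: PullRequest[]): [string, number][] {
  const counts = new Map<string, number>();
  for (const pr of prs) {
    counts.set(pr.author.login, (counts.get(pr.author.login) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

/** Fraction of PRs whose author login is in `team` (a set you supply yourself). */
function teamShare(prs: PullRequest[], team: Set<string>): number {
  if (prs.length === 0) return 0;
  return prs.filter((pr) => team.has(pr.author.login)).length / prs.length;
}

// Made-up sample standing in for the parsed contents of 1kprs.json.
const samplePrs: PullRequest[] = [
  { author: { login: "alice" } },
  { author: { login: "bob" } },
  { author: { login: "alice" } },
];
console.log(countByAuthor(samplePrs));
console.log(teamShare(samplePrs, new Set(["alice"])));
```

        This counts merged PRs only, so it says nothing about how many outside PRs were opened and never merged.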
  - DidYaWipe 4 days ago
    Yet the Settings UI is still a nonsensical mess.
    - thiht 3 days ago
      How is it a nonsensical mess? It’s clean, searchable, and lets you fall back to JSON.
      I get it might not be perfect, but "nonsensical mess" is maybe in bad faith here.
      - DidYaWipe 3 days ago
        Valid question. There's some area where the text of the setting's state is the opposite of what the checkbox shows. I'll try to dig up a screen shot.
        Historically, setting syntax colors has sucked too, but I don't remember the current state of that.
- Alupis 4 days ago
  That's because it's Microsoft's Trademarked version of Open Source.
  All the good FOSS vibes, without any of the hard FOSS work...
  - bmitc 4 days ago
    Out of Apple, Google, and Microsoft, Microsoft is _by far_ the most active and open to open source and contributions.
  - NewsaHackO 4 days ago
    I hate this analogy. Just because something is open source, doesn’t mean it is forced to commit or comment on every pull request which takes development time. If that notion really bothers you, you are free to fork VSCode and close all 600 pull requests on your fork.
    - jemiluv8 4 days ago
      Agree. OSS is hard work and not obligatory.
    - ‍ 4 days ago
      [deleted]
    - almosthere 4 days ago
      [flagged]
    - Alupis 4 days ago
      It's a common theme across most (all?) Microsoft "Open Source" repos. They publish the codebase on GitHub (which implies a certain thing on its own), but accept very little community input/contributions - if any.
      These repo's will usually have half a dozen or more Microsoft Employees with "Project Manager" titles and the like - extremely "top heavy". All development, decision making, roadmap and more are done behind closed doors. PR's go dormant for months or years... Issues get some sort of cursory "thanks for the input" response from a PM... then crickets.
      I'm not arguing all open source needs to be a community and accept contributions. But let's be honest - this is deliberate on Microsoft's part. They want the "good vibes" of being open source friendly - but corporate Microsoft still isn't ready to embrace open source. ie, it's fake open source.
      - cyral 4 days ago
        I've looked at a bunch of the popular JS libraries I depend on and they are all the same story, hundreds of open PRs. I think it's just difficult to review work from random people who may not be implementing changes the right way at all. Same with the project direction/roadmap, I'd say the majority of open source repos are like that. People will suggest ideas/direction all day and you can't listen to everyone.
        Not sure for VSCode, but for .NET 9 they claim: "There were over 26,000 contributions from over 9,000 community members! "
      - almosthere 4 days ago
        f. o. r. k. everything costs money, waaaay more than a $5 buy me a coffee. Every PR MS closes costs them thousands of dollars.
      - __jonas 4 days ago
        Why is that bad? Seems like a perfectly valid approach for an Open Source project, SQLite is doing the same.
- tomnipotent 4 days ago
  I'm not sure I see the problem. The number of merged PR's looks on the high side for a FOSS project.
  https://github.com/microsoft/vscode/pulls?q=is%3Apr+is%3Aclo...
- lozenge 3 days ago
  I've had a lot of PRs merged. If you don't create an issue or the issue already says it doesn't suit their vision then it won't get merged. It also helps to update the PR in November/December, even if there are no merge conflicts, as that's when they "clean up" and try to close as many as possible.
mirekrusin 4 days ago
There are just two forms of code - public domain and private. It's just that some people don't see it yet.
xenophonf 4 days ago
What is Copilot Chat but a front end to some Microsoft SaaS offering? There's nothing materially "open source" about that. All the important stuff is locked up behind the GitHub Copilot API. No one can customize the LLM design or training material. It certainly can't be self-hosted. This is just in-app advertising for yet another subscription service that sends your personal data to an amoral third party. There's no community, no public benefit, no commonwealth.
- phillipcarter 4 days ago
  I beg to differ. All commercial SOTA models emit roughly the same quality of code and have roughly the same limitations and ability to remain coherent in the size of context passed to them.
  As has always been the case, it's the mechanisms used to feed relevant contextual information and process results that set one tool apart from another. Everyone can code up a small agent that calls an LLM in a loop and passes in file contents. As I'm sure you've noticed, this alone does not make for a good coding agent.
- jemiluv8 4 days ago
  I don't follow the criticism. It is built on very weak foundations. Open source is just that - open source. Whether it is useful to you or anyone at all is another matter.
  - unethical_ban 4 days ago
    It's an open source... API connector to a closed source product.
    "Copilot chat" isn't open source. It's the service.
    - tiahura 4 days ago
      It’s really quite important. It takes the requests from the user and gives it to the llm to process. It’s got people skills!
  - tomalbrc 4 days ago
    It's white-washing through "Open Source". No one will benefit from this
    - jemiluv8 4 days ago
      Yet here we are, it is out there, some are already poking at how they render responses from their api. Trying to understand some of the technical choices they had to make. Someone has probably cloned this and started plugging in their own api - or reverse engineering the various api calls.
      In the end, the fact that it exists makes a difference. It won't be useful to all especially non-technical people who've never seen the nuts and bolts of a vscode extension.
    - lvturner 4 days ago
      Microsoft will, and won't this also help devalue a lot of smaller players in a similar area?
- MangoCoffee 4 days ago
  Doesn't open source mean users get the source code?
  I don't understand this criticism.
  - senko 4 days ago
    They get the source code to a client.
    The criticism is that most of the value is (presumably) on the API service side.
    https://gwern.net/complement
    - jemiluv8 4 days ago
      That is why people are comfortable open sourcing things like this. It is good publicity and they don't lose anything. On the other hand curious devs get to poke around and wonder how their copilot prompts were processed by the plugin. Or how it handles attaching files to context. And even what it sends in its payloads.
      Of course most of the value is on the API service side. That holds true for most applications these days.
  - grg0 4 days ago
    No, that's source available. See the OSI definition for what 'open source' means. And this is precisely the issue with 'open source' vs 'free software'. Once you rewire your brain for the latter, it's very obvious why a project like this is simply open-washing for PR points.
- dawnofdusk 4 days ago
  I mean you're right it's just a front end. And front ends can be open sourced? Obviously this has some public value: other people don't have to build a frontend starting from zero.
  I don't think it's well-aimed criticism to say that the LLM design/training material itself should have been made open source. Pretty much no one in the open source community would have the computational resources to actually do anything with this...
  - brahma-dev 4 days ago
    But they might have the computational resource to showcase how these companies are breaking the copyright law that they loved until recently.
  - ‍ 4 days ago
    [deleted]
  - ‍ 4 days ago
    [deleted]
  - jemiluv8 4 days ago
    They are not obligated to provide it even if people have the computational resources to operationalise it.
- 1vuio0pswjnm7 4 days ago
  I think this is at least the third comment I have seen recently complaining about "AI" APIs. As a non-developer, this is difficult for me to understand.
  I have had what appears to be the same, or similar, complaint against "Web APIs" for many years when trying to access public information. Not that long ago, websites did not have "APIs". Generally, I still extract information from the public websites rather than use "APIs". This requires no sign up and is guaranteed to work as long as the website exists.
  Before "Web APIs", websites did not routinely collect email addresses, track usage, rate limit individuals and charge subscription fees in exchange for access to public information.
  Sometimes the "Web APIs" are free but "sign up" is always required. There is email address collection, usage is tracked, "accounts" are rate limited. In the past, some HN commenters also complained that these "APIs" are unreliable in the sense that they can be "discontinued" suddenly for any reason. Anyone depending on them has no recourse.
  These "Web APIs" became so common that their rationale went unquestioned. For example, why not let www users download bulk data. In rare cases, this is an option, e.g., some government websites, Wikipedia dumps, etc. But this is the exception not the norm.
  In light of these comments complaining about accessing LLMs through Web APIs I am wondering:
  Are "Web APIs", "SaaS", etc. now being turned against the people who concertedly promoted these tactics, namely, software developers?
  I always saw "APIs" as an easy way website operators could deny access to www users. Like some requirement to have an "account" in order to access public information. Those providing these "APIs" have the data in a format they could provide for download, e.g., JSON, but bulk downloads are not provided. This tactic of charging fees is quite different from how websites operated for the first decades I used the www.
  Similarly, LLM providers have details about the models, e.g., weights, etc., in formats they could provide for download. Instead, users are encouraged to "sign up", "subscribe", or whatever, for access to public information^1 through an "API".
  1. Assuming the provider obtained the training material from the www, a public resource.