> There is an increasing crowd of people who ask a large language model to "find a problem in curl, make it sound terrible", then send the result, which is never correct, to the project, thinking that they are somehow helping.
Our worst nightmares are coming true indeed.
The worst nightmare would be if the maintainers, in turn, used large language models to review or apply these patches.
I already have some processes at work that are reviewed by AI only, which means we are advised to use another AI to fill out the data quicker.
It's nothing critical, but still both scary and hilarious at the same time. Shit on the input, shit on the output - nothing new, just fancier tools.
Asimov's vision of a history so tangled and noisy that no one really knows what is truth and what is legend is happening in front of our own eyes. It didn't take millennia, just a few years of AI companies abusing the knowledge that was available to anyone for free.
And then have another one duke it out with the first one to reject the patch. That would be a nice LLM-vs-LLM, prompt-fight-prompt :o)
The problem is that open source maintainers rarely react, because most projects are captured by some big tech employees. Independent authors like Stenberg are the exception.
If the rebellious spirit of the 1990s and early 2000s still existed, open source could sink "AI" code laundromats within a month. But since 2010 everyone has been falling over themselves to please Big Tech, which now rewards the cowards with layoffs and intimidation.
Most developers do not understand that power balances in corporations work on a primal level. If you show fear/submission, managers will treat you like a beta dog. That is all they understand.
This is getting more common. I've seen CVEs filed against several open source projects that included made-up APIs.
In my experience, most businesses (or at least the developers working for them) would actually like to donate or pay for support on the OSS projects they rely on. The problem is that it is hard to do so due to legislation, compliance, etc.
Example: I once convinced my employer to donate to some open source projects we relied on. They did; then, a few months later, they got slapped on the wrist by the authorities for not being able to prove where these overseas payments were going and that they weren't being used to fund terrorist activities.
Similarly, I used to contribute to an OSS project, and we did get asked by some corporations to do paid work like bug fixes or features. The problem was that they required invoices in order to be allowed to pay us, so we needed to register as a company, get a tax number, etc. I was a freelancer at the time, so I offered to use my business registration to invoice, then split the profit amongst the contributors. The very first paying 'customer' immediately hit us with a 20-page vendor assessment form asking about my SOC 2 or ISO 27001 certifications, data security policies, background checks of my 'employees', etc. Then my accountant warned me that distributing the payment amongst other people would be seen as disguised wages and could get me into serious legal trouble.
Granted, this was some years ago, and things have gotten better with initiatives such as GitHub Sponsors, Ko-fi, and Patreon. But at the same time legislation has gotten more restrictive, and doing business with large corporations is difficult, expensive, and very time-consuming. It's not worth it for most OSS maintainers, and similarly it isn't worth the legal headache for the large corporations to make these kinds of donations.
The talk referred to in the article can be found here, just 13 minutes:
Keynote: Giants, Standing on the Shoulders Of - Daniel Stenberg, Founder of the Curl Project
https://www.youtube.com/watch?v=YEBBPj7pIKo
While the article does a great job, the video's graphs and photos add a lot more depth.
The full story: https://archive.fosdem.org/2024/schedule/event/fosdem-2024-1...
The Sovereign Tech Agency (German federal government) donated about 200k€ to the project. Not a brand though. https://en.m.wikipedia.org/wiki/Sovereign_Tech_Agency
Step 1: Set up a GoFundMe
Step 2: Announce that, until the aforementioned GoFundMe reaches $10 million, all new commits to curl will be licensed under the AGPL.
Step 3: Profit
Step 3: get forked and lose?
Step 4: it's someone else's problem, win
You can use LLMs as part of the process of identifying bugs, developing features, etc., but you must verify the results. Accepting what the LLM says without testing, checking, and verifying the output is lazy and likely to produce errors, or to make the code harder to maintain, e.g. if what the LLM produces isn't in line with the project's development and formatting standards or changes other parts of the code.
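To make "verify" concrete: the minimum bar is mechanical. Here is a minimal sketch of a pre-submit gate in Python, assuming a project with a test suite; the commands (pytest, git diff --check) are placeholders for the project's own checks, not anything curl-specific:

    #!/usr/bin/env python3
    """Minimal pre-submit gate: run the project's own checks before a
    human reviews an LLM-assisted change. Command names are placeholders."""
    import subprocess
    import sys

    # Hypothetical checks -- substitute the project's real ones.
    CHECKS = [
        ["pytest", "-q"],            # the tests must still pass
        ["git", "diff", "--check"],  # no stray whitespace in the change
    ]

    def main() -> int:
        for cmd in CHECKS:
            print("running:", " ".join(cmd))
            if subprocess.run(cmd).returncode != 0:
                print("FAILED -- do not submit until this passes and you understand why.")
                return 1
        print("Checks passed. Now actually read the diff yourself.")
        return 0

    if __name__ == "__main__":
        sys.exit(main())

A gate like this won't catch a subtly wrong patch, but it filters out the laziest failures before a human spends time on them.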
Generally speaking, the second you realize a technology/process/anything has a hard requirement that individuals independently exercise responsibility or self-control, with no obvious immediate gain for themselves, it is almost certain that said technology/process/anything is unsalvageable in its current form.
That is the general case. But with LLMs, the entire selling point is specifically offloading "reasoning" to them. That is quite literally what they are selling you. So with LLMs, you can swap out "almost certain" in the above rule for "absolutely certain without a shadow of a doubt". This isn't even hypothetical, as we have experimental evidence that LLMs cause people to think and reason less. So you are at best already starting at a deficit.
But more importantly, this makes the entire premise of using LLMs make no sense (at least from a marketing perspective). What good is a thinking machine if I need to verify it? Especially when you are telling me that it will be a "super reasoning" machine soon. Do I need a human "super verifier" to match? In fact, that's not even a tomorrow problem, that is a today problem: LLMs are quite literally advertised to me as a "PhD in my pocket". I don't have a PhD. Most people would find the idea of me "verifying the work of human PhDs" quite silly, so how does it make any sense that I am in any way qualified to verify my robo-PhD? I pay for it precisely because it knows more than I do! Do I now need to hire a human PhD to verify my robo-PhD? Short of that, is it the case that only human PhDs are qualified to use robo-PhDs? In other words, should LLMs be used exclusively for things the operator already knows how to do? That seems weird. It's like a Magic 8 Ball that only answers questions you already know the answer to. Hilariously, you could even find someone reaching the conclusion of "well, sure, a curl expert should verify the patch I am submitting to curl. That's what submitting the patch accomplishes! The experts who work on curl will verify it! Who better to do it than them?" And now we've come full circle!
To be clear, each of these questions has plenty of counter-points/workarounds/etc. The point is not to present some philosophical gotcha argument against LLM use. The point rather is to demonstrate the fundamental mismatch between the value-proposition of LLMs and their theoretical "correct use", and thus demonstrate why it is astronomically unlikely for them to ever be used correctly.
I use coding LLMs as a mix of:
1. a better autocomplete -- here the models can make mistakes, but on balance I've found this useful, especially when constructing tests, writing output in a structured format, etc.;
2. a better search/query tool -- I've found answers by describing what I'm trying to do, whereas with a traditional search I have to know the right keywords to try. I can then go to the documentation or search further if I need additional help or information;
3. an assistant to bounce ideas off -- this can be useful when you are not familiar with the APIs or configuration. It still requires testing the code and seeing what works and what doesn't. I treat it the same way as reading a blog post on a topic: the post may be outdated, may contain issues, or may not be quite what I want. However, it can have enough information for me to get the answer I need -- e.g. a particular method, for which I can then consult the docs (such as documentation comments on the APIs). Or it lets me know what to search for on Google, etc.
In other words, I use LLMs as part of the process, just as I would a search engine, Stack Overflow, etc.
There's a corollary here to self-driving cars, which need constant babysitting.
>>Companies tend to assume that somebody else is paying for the development of open-source software, so they do not have to contribute.
I think if you are a billion dollar company using these tools, sponsoring maintenance isn't a lot to ask.
Curiously enough this came up even during the days of Perl.
I don't think Perl got its due, especially given that until fairly recently almost everything of importance was done with Perl. Heck, the internet was made possible because of Perl.
> I think if you are a billion dollar company using these tools, sponsoring maintenance isn't a lot to ask.
It isn't a lot to ask, but it's challenging to 1) find who to ask, and 2) get them to care about the long-term view in a way that doesn't fit into short-term thinking and budgeting.
Being the one car maker called out on a slide for having supported curl would be so cheap and would get them lots of attention.
I've often asked maintainers how my company could support them. Most don't understand the question. Those that do only point out that I can contribute code changes, which I have done, but rarely: we pick good projects that meet our needs, so there are rarely bugs or features we care about enough to set aside our regular work.
What would be nice is a non-profit that would take money and distribute it to the projects we use, likely with some legal vetting of those projects (whatever that means). The FSF is the only one I know of that does generic development, and they have ideas that companies generally oppose, so they are out.
A lot of open source maintainers are bad at asking for money, and most companies find it very hard to give money away without some kind of formal arrangement in place.
Here's a way you can work around that, if you are someone who works for a company with money:
Contact the maintainers of software you use and invite them to speak to your engineering team via Zoom in exchange for a speaking fee.
Your company knows how to pay consultants. It likely also has an existing training budget you can tap into.
You're not asking the maintainer to give a talk - those take time to prepare and require experience with public speaking.
Instead, set it up as a Q&A or a fireside chat. Select someone from your own team who is good at facilitating / asking questions.
Aim for an hour of time. Pay four figures.
Then do the same thing once a month or so for other projects you depend on.
I really like the idea of normalizing companies reaching out to maintainers and offering them a relatively easy hour-long remote consultation in exchange for a generous payment. I think this may be a discreet way to funnel money into the pockets of people whose work a company depends on.
This is very creative, and I suspect would work.
It does have the side effect of taking up the time of 1+n engineers for that hour. I might be able to rustle up a few in month 1, but I'm not going to be able to do it monthly.
Frankly, as long as the builder has a "support contract" option, that should be sufficient.
I will add that, for maintainers, understanding how business works is a huge help in getting paid. I advocated for supporting a project (they have "sponsored by" marketing on their web page, so we could take it out of the marketing budget), but they could only be paid via PayPal, which unfortunately we can't do, so the deal fell through.
It didn't help that the home page in question contained a lot of sarcasm and was antagonistic in tone, likely (I suspect) because of the nonsense the maintainer had to wade through. Ultimately no money got sent.
I'm happy to support OSS, but I can only spend so much social capital doing so. My advice to maintainers: if you want sponsorship, put some effort into making that channel professional. It really helps.
Many projects have foundations or fiscal sponsors you can work with.
If you care about Python, you could support the Python Software Foundation, and/or hire or sponsor some Python developers. If you care about Rust, support the Rust Foundation, and/or hire or sponsor some Rust developers. If you care about Reproducible Builds, or QEMU, or Git, or Inkscape, or the future of FOSS licensing, or various other projects (https://sfconservancy.org/projects/current/), support Software Freedom Conservancy.
If you care about a smaller project, and they don't have a means of sponsorship, you could encourage them to accept sponsorship via some means, or join some fiscal sponsor umbrella like Conservancy.
Another such umbrella organization is Software in the Public Interest (SPI). Some of the more notable projects they sponsor include Arch Linux, Debian, FFmpeg, LibreOffice, OpenSSL, OpenZFS, PostgreSQL, and systemd.
Homepage: https://www.spi-inc.org/
I'd like it if GitHub offered a billing option that automatically donated to the open source projects in your active repos.
It would be cool to build a "library clout" measure for all open source software. First, collect usage measures for all deployed software systems, per platform and along other interesting dimensions, such as how each system relates to others (is it a common dependency or platform for other deployed software?). Use this to generate "clout" at the deployed-software level. Then detect all the open source libraries compiled into each system, by binary signature matching or through the software's own build system if it is open. A library's "clout" is then built up from the clout of the projects that use it.
This clout score might be used to guide investments by a non-profit funding critical OSS. Data collection would be challenging, though, as would calibrating need.
Basically make a rigorous score to track some of the intuition from https://xkcd.com/2347/
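A minimal sketch of the attribution step, assuming library detection has already happened; the deployment names, usage weights, and the log-scaled even split are all made up for illustration:

    import math
    from collections import defaultdict

    # Made-up inputs: per-deployment usage and the open source libraries
    # detected inside each deployment (via signatures or build metadata).
    deployments = {
        "web-frontend": {"usage": 1_000_000, "libs": {"curl", "zlib", "openssl"}},
        "build-farm":   {"usage": 50_000,    "libs": {"curl", "zlib"}},
        "desktop-app":  {"usage": 200_000,   "libs": {"libpng", "zlib"}},
    }

    def library_clout(deployments):
        """Split each deployment's log-scaled usage evenly across the
        libraries it contains -- one simple weighting among many."""
        clout = defaultdict(float)
        for dep in deployments.values():
            if dep["libs"]:
                share = math.log1p(dep["usage"]) / len(dep["libs"])
                for lib in dep["libs"]:
                    clout[lib] += share
        return sorted(clout.items(), key=lambda kv: -kv[1])

    print(library_clout(deployments))
    # zlib ranks first because every deployment contains it -- exactly
    # the xkcd 2347 "load-bearing dependency" intuition.

A real version would propagate clout transitively through library-to-library dependencies too, but even this flat attribution surfaces the ubiquitous, underfunded pieces.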
There is one, though, focused on security: https://openssf.org/projects/criticality-score/
Sounds like Tidelift.
Just have a policy of firing these "security researchers" whenever they submit AI-generated BS to curl.
Fire them from where? Their undergraduate studies at IIT Hyderabad? Daniel doesn't have the authority to do that.
There are indeed missing steps, and I won't claim to reinvent society in one HN thread, /but/ I am just saying that there need to be real-life consequences for being a shitty student. Is that email coming from an .edu address?
Submit a complaint to the university. Just make an email template and fight back. It takes a minute to find the student affairs or dean's email. Surely there will be one person in the entire institution who cares.
Every day, if I read HN, I find reasons to just go back to working on PrizeForge
don't mind if you do guv, don't mind at all.