

Do you have a link? Because if you tell me you found it but don’t tell me where, then that may put you at risk of being in the same ring of hell as people who comment “nvm, solved it” to their tech forum posts.


As a society, we need to better value the labour that goes into our collective knowledge bases. Non-English Wikipedia is just one example of this, but it highlights the core of the problem: the system relies on a tremendous amount of skilled labour that cannot easily be done by just a few volunteers.
Paying people to contribute would come with problems of its own (in a hypothetical world where this was permitted by Wikipedia, which I don’t believe it is at present), but it would be easier for people to contribute if the time they wanted to volunteer wasn’t competing with their need to keep their head above water financially. Universal basic income, or something similar, seems like one of the more viable ways to ease this tension.
However, a big component of the problem is the less concrete side of how society values things. I’m a scientist in an area where we are increasingly reliant on scientific databases, such as the Protein Data Bank (PDB), where experimentally determined protein structures are deposited and annotated, as well as countless databases on different genes and their functions. Active curation of these databases is how we’re able to research a gene in one model organism, and then apply those insights to the equivalent gene in other organisms.
For example, CG9536 is a gene found in Drosophila melanogaster — fruit flies, a common model organism for genetic research, due to the ease of working with them in a lab. Much of the research around this particular gene can be found on FlyBase, a database for D. melanogaster gene research. Despite fruit flies being super different to humans, many of their genes have equivalents in humans, and CG9536 is no exception; TMEM115 is what we call it in humans. The TL;DR answer of what this gene does is “we don’t know”, because although we have some knowledge of what it does, the tricky part about this kind of research is figuring out how genes or proteins interact as part of a wider system — even if we knew exactly what a gene does in a healthy person, for example, it’s much harder to understand what kinds of illnesses arise from a faulty version of it, or whether a gene or protein could be a target for developing novel drugs. I don’t know much about TMEM115 specifically, but I know someone who was exploring whether it could be relevant in understanding how certain kinds of brain tumours develop. Biological databases are a core component of how we can begin to make sense of the bigger picture.
Whilst the data that fill these databases are produced by experimental research attached to published papers, there’s a tremendous amount of work that makes all these resources talk to each other. That FlyBase link above links to the page on TMEM115, and I can use these resources to synthesise research across fields that would previously have been siloed: the folks who work on flies will have a different research culture than those who work in human gene research, or yeast, or plants, etc. TMEM115 is also sometimes called TM115, and it would be a nightmare if a scientist reviewing the literature missed some important existing research that referred to the gene under a slightly different name.
Making these biological databases link up properly requires active curation, a process that the philosopher of science Sabine Leonelli refers to as “data packaging”, a challenging task that includes asking “who else might find this data useful?” [1]. The people doing the experiments that produce the data aren’t necessarily the best people for figuring out how to package and label that data for others to use, because this inherently requires thinking in a way that spans many different research subfields. Crucially though, this infrastructure work gives a scientist far fewer opportunities to publish new papers, which means this essential labour is devalued in our current system of doing science.
It’s rather like how some of the people who are adding poor quality articles to non-English Wikipedia feel like they’re contributing because automated tools allow them to create more new articles than someone with actual specialist knowledge could. It’s the product of a culture of an ever-hungry “more” that fuels the production of slop, devalues the work of curators and is degrading our knowledge ecosystem. The financial incentives that drive this behaviour play a big role, but I see that as a symptom of a wider problem: society’s desire to easily quantify value causes important work that’s harder to quantify to be systematically devalued (a problem we also see in how reproductive labour, i.e. the labour involved in managing a family or household, has historically been dismissed).
We need to start recognising how tenuous our collective knowledge is. The OP discusses languages with few native speakers, which likely won’t affect many who read the article, but we’re at risk of losing far more than that. The more we learn, the more we need to invest in expanding our systems of knowledge infrastructure, as well as maintaining what we already have.
[1]: Rather than the paper in which Sabine Leonelli coined the phrase “data packaging”, I’ll cite her 2016 book “Data-Centric Biology: A Philosophical Study”. I don’t imagine that many people will read this large comment of mine, but if you’ve made it this far, you might be interested in checking out her work. Though it’s not aimed at a general audience, it’s still fairly accessible, if you’re the kind of nerd who is interested in discussing the messy problem of making a database usable by everyone.
If your appetite for learning is larger than your wallet, then I’d suggest that Anna’s Archive or similar is a good shout. Some communities aren’t cool with directly linking to resources like this, so know that you can check the Wikipedia page of shadow library sites to find a reliable link: https://en.wikipedia.org/wiki/Anna%27s_Archive


This isn’t useful to me at the moment, but nonetheless, I really appreciate when people like you take the time to share knowledge in this way.
There is a special place in heaven for the people who do this, just as there’s a special place in hell for people who reply “nvm, solved it” to their own forum post about a complex technical problem.
(Side note: I tried to find the xkcd where he is yelling at his computer after finding a forum post from someone who had the exact same computer problem as he does, but the original poster never updated it. Alas, I couldn’t find it. If anyone knows which one I mean, I’d appreciate you pointing me to it, because it’ll drive me mad until I remember how to find it)


Exactly. DevOps engineers are already super skilled at using automation where appropriate, but knowing how and when to do that is still an extremely human task


A handful of senior engineers or developers. And then we’re even more ducked when they retire or die, because no-one is hiring junior engineers or developers


This is supremely silly. I will never use it, but I’m glad that it exists; you’re delightful
“TIL i do ~~women~~ people things. lol”
The point of “you don’t have to hold your farts in to be a woman” isn’t to suggest that only women fart, but that farting is a thing that people do, and that given that women are a subset of people, women fart (and that farting doesn’t make someone less of a woman)
There’s a balance. I have known plenty of women who felt it was not permissible to fart around people/in public ever. One would not even fart around her husband of 10+ years. Another would only fart when they were at home, in the bathroom. Another felt it was inappropriate to ever fart, even when she was pooping (as a result of this, she once was so constipated that she had to go to the hospital).
Whilst these are particularly extreme examples, they’re just instances of a general trend where women farting is stigmatised more than men farting. I interpret the image in the OP to be resisting that excessive pressure and unrealistic standard rather than advocating for disregarding basic courtesy and farting with impunity


What do you mean by “ideological reasons”? You’re using the phrase as if it’s a bad thing, but I struggle to imagine how anyone could exist in a political role such as FTC chair and not bring their ideology into their work.


“too dumb to understand code requirements in every industry and profession.”
Or selfish. Unfortunately Hanlon’s razor can only cut so deep.


Is it? I didn’t get that sense. What causes you to think it’s written by ChatGPT? (I ask because whilst I’m often good at discerning AI content, there are plenty of times that I don’t notice it until someone points out things that they noticed that I didn’t)


Sometimes, I feel like writers know that it’s capitalism, but they don’t want to actually call the problem what it is, for fear of scaring off people who would react badly to it. I think there’s probably a place for this kind of oblique rhetoric, but I agree with you that progress is unlikely if we continue pussyfooting around the problem


“not that hard to do”
Eh, I’m not so sure on that. I often find myself tripping up on the xkcd Average Familiarity problem, so I worry that this assumption is inadvertently a bit gatekeepy.
It’s the unfortunate reality that modern tech makes it pretty hard for a person to learn the kind of skills necessary to be able to customise one’s own tools. As a chronic tinkerer, I find it easy to underestimate how overwhelming it must feel for people who want to learn but have only ever learned to interface with tech as a “user”. That kind of background means that it requires a pretty high level of curiosity and drive to learn, and that’s a pretty high bar to overcome. I don’t know how techy you consider yourself to be, but I’d wager that anyone who cares about whether something is open source is closer to a techy person than the average person.


Sidestepping the debate about whether AI art is actually fair use, I do find the fair use doctrine an interesting lens for the wider issue — in particular, how deciding whether something is fair use is less a matter of comparing a case against a straightforward checklist, and more of weighing it along a fairly dynamic spectrum.
It’s possible for a use to score strongly on one fair use factor and weakly on another, and still be judged fair overall.
I’m no lawyer, but I find the theory behind fair use pretty interesting. In practice, it leaves a lot to be desired (for example, the way that YouTube’s Content ID suppresses what would almost certainly be fair use, because Google wants to avoid being taken to court by rights holders, so it preempts the problem by being overly harsh towards potential infringement). However, my broad point is that whether a court decides something is fair use relies on a holistic assessment that considers all four pillars of fair use, including how strongly each applies.
AI trained on artists’ works is different to making a collage because of the scale of the scraping — a huge amount of copyrighted work has been used, and entire works of art were used, even if the processing of them is considered transformative (let’s say, for the sake of argument, that training an AI is highly transformative). The pillar that AI runs up against the most, though, is “the effect of the use upon the potential market”. AI has already had a huge impact on the market for artistic works, and it is having a hugely negative impact on people’s ability to make a living through their art (or other creative endeavours, like writing). What’s more, the companies pushing AI are making inordinate amounts of revenue, which makes the whole thing feel especially egregious.
We can draw on the ideas of fair use to understand why so many people feel that AI training is “stealing” art whilst being okay with collage. In particular, it’s useful to ask: what is the point of fair use? Why have a fair use exemption to copyright at all? One of the purposes of copyright is meant to be to encourage people to make more creative works — if you’re unable to make any money from your efforts because you’re competing with people selling your own work faster than you can, then you’re pretty strongly disincentivised from making anything at all. Fair use is a pragmatic exemption, carved out in recognition that if copyright is overly restrictive, it will end up making it disproportionately hard to make new stuff. Fair use is as nebulously defined as it is because, in theory, it is guided by the principle of upholding the spirit of copyright.
Now, I’m not arguing that training an AI (or generating AI art) isn’t fair use — I don’t feel equipped to answer that particular question. As a layperson, it seems like current copyright laws aren’t really working in this digital age we find ourselves in, even before we consider AI. Though perhaps it’s silly to blame computers for this, when copyright wasn’t really helping individual artists much even before computers became commonplace. Some argue that we need new copyright laws to protect against AI, but Cory Doctorow makes a compelling argument that this would just end up biting artists in the ass even worse than the AI does. Copyright probably isn’t the right lever to pull to solve this particular problem, but it’s still a useful thing to consider if we want to understand the shape of the whole problem.
As I see it, copyright exists because we, as a society, said we wanted to encourage people to make stuff, because that enriches society. However, that goal was in tension with the realities of living under capitalism, so we tried to resolve that through copyright laws. Copyright presented new problems, which led to the fair use doctrine, which comes with problems of its own, with or without AI. The reason people consider AI training to be stealing is that they understand AI as a dire threat to the production of creative works, and they attempt to articulate this through the familiar language of copyright. However, that’s a poor framework for addressing the problem that AI art poses. We would do better to strip this down to its ethical core so we can see the actual tension that people are responding to.
Maybe we need a more radical approach to this problem. One interesting suggestion that I’ve seen is that we should scrap copyright entirely and implement a generous universal basic income (UBI), along with other social safety nets. If creatives were free to make things without worrying about fulfilling basic living needs, it would make the problem of AI scraping far lower stakes for individual creatives. One problem with this is that most people would prefer to earn more than what even a generous UBI would provide, so they would probably still feel cheated by generative AI. However, the argument is that generative AI cannot compare to human artists when it comes to producing novel or distinctive art, so the most reliable way to obtain meaningful art would be to give financial support to the artists (especially if an individual is after something of a particular style). I’m not sure how viable this approach would be in practice, but I think that discussing more radical ideas like this is useful in figuring out what the heck to do.


I get what you’re saying.
I often find myself being the person in the room with the most knowledge about how Generative AI (and other machine learning) works, so I tend to be the person who answers questions from people who want to check whether their intuition is correct. Yesterday, someone asked me whether LLMs have any potential uses, or whether the technology is fundamentally useless, and the way they phrased it allowed me to articulate something better than I had previously been able to.
The TL;DR was that I actually think that LLMs have a lot of promise as a technology, but not like this; the way they are being rolled out indiscriminately, even in domains where it would be completely inappropriate, is actually obstructive to properly researching and implementing these tools in a useful way. The problem at the core is that AI is only being shoved down our throats because powerful people want to make more money, at any cost — as long as they are not the ones bearing that cost. My view is that we won’t get to find out the true promise of the technology until we break apart the bullshit economics driving this hype machine.
I agree that even today, it’s possible for the tools to be used in a way that’s empowering for the humans using them, but it seems like the people doing that are in the minority. It seems pretty hard for a tech layperson to do that kind of stuff, not least of all because most people struggle to discern the bullshit from the genuinely useful (and I don’t blame them for being overwhelmed). I don’t think the current environment is conducive towards people learning to build those kinds of workflows. I often use myself as a sort of anti-benchmark in areas like this, because I am an exceedingly stubborn person who likes to tinker, and if I find something exhausting to learn, it seems unreasonable to expect the majority of people to manage it.
I like the comic’s example of Photoshop’s background remover, because I doubt I’d know as many people who make cool stuff in Photoshop without helpful bits of automation like that (“cool stuff” in this case often means amusing memes or jokes, but for many, that’s the starting point in continuing to grow). I’m all for increasing the accessibility of an endeavour. However, the positive arguments for Generative AI often feel like they’re actually reinforcing gatekeeping rather than increasing accessibility; they implicitly divide people into the static categories of Artist and Non-Artist, and then argue that Generative AI is the only way for Non-Artists to make art. This seems to promote a sense of defeatism by suggesting that it’s not possible for a Non-Artist to ever gain worthwhile levels of skill. As someone who sits squarely in the grey area between “artist” and “non-artist”, this makes me feel deeply uncomfortable.


I liked it, personally. I’ve read plenty of AI bad articles, and I too am burnt out on them. However, what I really appreciated about this was that it felt less like a tirade against AI art and more like a love letter to art and the humans who create it. As I approached the ending of the comic, for example, when the argument had been made and the artist was just delivering his closing words, I was struck by the simple beauty of the art. It was less the shapes and the colours themselves that I found beautiful than the sense that I could practically feel the artist straining against the pixels in his desperation to make something that he found beautiful — after all, what would be the point if he couldn’t live up to his own argument?
I don’t know how far you got through, but I’d encourage you to consider taking another look at it. It’s not going to make any arguments you’ve not heard before, but if you’re anything like me, you might appreciate it from the angle of a passionate artist striving to make something meaningful in defiance of AI. I always find my spirits bolstered by work like this because whilst we’re not going to be able to draw our way out of this AI-slop hellscape, it does feel important to keep reminding ourselves of what we’re fighting for.


I’ve been practicing at being a better writer, and one of the ways I’ve been doing that is by studying the writing that I personally really like. Often I can’t explain why I click so much with a particular style of writing, but by studying and attempting to learn how to copy the styles that I like, it feels like a step towards developing my own “voice” in writing.
A common adage around art (and other skilled endeavours) is that you need to know how to follow the rules before you can break them, after all. Copying is a useful stepping stone to something more. It’s always going to be tough to learn when your ambition is greater than your skill level, but there’s a quote from Ira Glass that I’ve found quite helpful:
“Nobody tells this to people who are beginners, I wish someone told me. All of us who do creative work, we get into it because we have good taste. But there is this gap. For the first couple years you make stuff, it’s just not that good. It’s trying to be good, it has potential, but it’s not. But your taste, the thing that got you into the game, is still killer. And your taste is why your work disappoints you. A lot of people never get past this phase, they quit. Most people I know who do interesting, creative work went through years of this. We know our work doesn’t have this special thing that we want it to have. We all go through this. And if you are just starting out or you are still in this phase, you gotta know it’s normal and the most important thing you can do is do a lot of work. Put yourself on a deadline so that every week you will finish one story. It is only by going through a volume of work that you will close that gap, and your work will be as good as your ambitions. And I took longer to figure out how to do this than anyone I’ve ever met. It’s gonna take awhile. It’s normal to take a while. You’ve just gotta fight your way through.”


What makes you want to do art? I’m just curious, because I am also someone who has bounced off of attempting to learn to do art a bunch of times, and found tracing unfulfilling (I am abstaining from the question of whether tracing is art, but I do know it didn’t scratch the itch for me).
For my part, I ended up finding that crafts like embroidery or clothing making were the best way to channel my creative inclinations, but that’s mostly because I have the heart of a ruthless pragmatist and I like making useful things. What was it that caused you to attempt to learn?
One of my favourite jokes I’ve ever been told was from a German friend.
His deadpan delivery had me in stitches. I am convinced that the stereotype about Germans not having a sense of humour is in fact a German invention, so they can mess with that expectation to be hilarious when we least expect it