That was a beautiful read.

But I do find myself conflicted about dismissing it as a potential technical skill altogether.
I have seen ComfyUI workflows that are built in very complex ways: some have the canvas divided into different zones, each with its own prompts. Some have no prompts at all and instead extract concepts like composition or color values from other files.
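(For anyone curious what such a zoned workflow actually is under the hood: a ComfyUI graph is just JSON, which you can also submit programmatically. Below is a minimal, hypothetical sketch of the two-zone idea using ComfyUI's stock ConditioningSetArea and ConditioningCombine nodes and its local /prompt endpoint; the checkpoint filename, prompts, and parameter values are placeholders, not anyone's actual workflow.)

```python
# Hypothetical sketch: two prompt "zones" composited via ComfyUI's area
# conditioning, submitted to a locally running ComfyUI instance.
# The checkpoint name and prompts are made up for illustration.
import json
import urllib.request

graph = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "my_local_model.safetensors"}},
    # Base prompt covering the whole canvas.
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "stormy sea, oil painting"}},
    # Second prompt, later confined to one zone.
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "lighthouse, warm light"}},
    # Restrict the second prompt to the right half of the canvas.
    "4": {"class_type": "ConditioningSetArea",
          "inputs": {"conditioning": ["3", 0],
                     "width": 512, "height": 1024,
                     "x": 512, "y": 0, "strength": 1.0}},
    # Merge base and zoned conditioning into one positive signal.
    "5": {"class_type": "ConditioningCombine",
          "inputs": {"conditioning_1": ["2", 0], "conditioning_2": ["4", 0]}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "7": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "8": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["5", 0],
                     "negative": ["6", 0], "latent_image": ["7", 0],
                     "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "9": {"class_type": "VAEDecode",
          "inputs": {"samples": ["8", 0], "vae": ["1", 2]}},
    "10": {"class_type": "SaveImage",
           "inputs": {"images": ["9", 0], "filename_prefix": "two_zones"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",  # ComfyUI's default local port
    data=json.dumps({"prompt": graph}).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)
```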
I compare these with collage art, which also works from pre-existing material to create something new.
Such tools take practice; there are choices to be made, and there is a creative process. But it is mostly technological knowledge, so on that basis it would be right to call it a technical skill.
The sad reality, however, is how easy it is to strip away that complexity “because it’s too hard” and pare it down to simple prompt-to-output. At that point all technical skill fades, and it becomes no different from the online generators you find everywhere.
I get what you’re saying.

I often find myself being the person in the room with the most knowledge about how Generative AI (and other machine learning) works, so I tend to be in the role of the person who answers questions from people who want to check whether their intuition is correct. Yesterday, someone asked me whether LLMs have any potential uses or whether the technology is fundamentally useless, and the way they phrased it allowed me to articulate something better than I had previously been able to.
The TL;DR was that I actually think that LLMs have a lot of promise as a technology, but not like this; the way they are being rolled out indiscriminately, even in domains where it would be completely inappropriate, is actually obstructive to properly researching and implementing these tools in a useful way. The problem at the core is that AI is only being shoved down our throats because powerful people want to make more money, at any cost — as long as they are not the ones bearing that cost. My view is that we won’t get to find out the true promise of the technology until we break apart the bullshit economics driving this hype machine.
I agree that even today, it’s possible for the tools to be used in a way that’s empowering for the humans using them, but it seems like the people doing that are in the minority. It seems pretty hard for a tech layperson to do that kind of stuff, not least of all because most people struggle to discern the bullshit from the genuinely useful (and I don’t blame them for being overwhelmed). I don’t think the current environment is conducive to people learning to build those kinds of workflows. I often use myself as a sort of anti-benchmark in areas like this, because I am an exceedingly stubborn person who likes to tinker, and if I find something exhausting to learn, it seems unreasonable to expect the majority of people to manage it.
I like the comic’s example of Photoshop’s background remover, because I doubt I’d know as many people who make cool stuff in Photoshop without helpful bits of automation like that (“cool stuff” in this case often means amusing memes or jokes, but for many, that’s the starting point for continuing to grow). I’m all for increasing the accessibility of an endeavour. However, the positive arguments for Generative AI often feel like they’re reinforcing gatekeeping rather than actually increasing accessibility; they implicitly divide people into the static categories of Artist and Non-Artist, and then argue that Generative AI is the only way for Non-Artists to make art. It seems to promote a sense of defeatism by suggesting that it’s not possible for a Non-Artist to ever gain worthwhile levels of skill. As someone who sits squarely in the grey area between “artist” and “non-artist”, this makes me feel deeply uncomfortable.
We’re on the same page.

I actually had a friend who jokingly mocked me for liking AI, because I was initially very excited about DALL-E and ChatGPT 3.5.
Back then I could only see the potential that it continues to have. OpenAI appeared to have altruistic goals and was a non-profit. A Trojan horse, as it turned out.
I had to make it pretty clear to my friend that my stance on the current slop situation is “yes, but not like this, anything but this”.
All of that’s great and everything, but at the end of the day, all of the commercial VLM art generators are trained on stolen art. That includes most of the VLMs that ComfyUI uses as a backend. They have their own cloud service now, which ties in with all the usual suspects.
So even if it has some potentially genuine artistic uses, I have zero interest in using a commercial entity in any way to ‘generate’ art from elements they took from artwork stolen from real artists. It’s immoral.
If it’s all running locally on open source VLMs trained only on public data, then maybe. But that’s what… a tiny, tiny fraction of AI art? In the meantime I’m happy to dismiss it altogether as AI slop.
How is that any different from “stealing” art in a collage, though? While courts have disagreed on the subject (in particular, there’s a big difference between visual collage and music sampling, with the latter being very restricted), there is a clear argument to be made that collage is fair use of the original works, because the result is completely different.
Sidestepping the debate about whether AI art is actually fair use, I do find the fair use doctrine an interesting lens for the wider issue. In particular, deciding whether something is fair use is less a matter of checking a case against a straightforward checklist and more a matter of weighing a fairly dynamic spectrum of factors.
It’s possible that something could be:
Highly transformative
Taking from a published work that is primarily factual in nature (such as a biography)
Distributed to a different market than the original work
but still not be considered fair use if it used the entirety of the base work without modification (in this case, “highly transformative” would pertain to how the chunks of the base work are presented).
I’m no lawyer, but I find the theory behind fair use pretty interesting. In practice, it leaves a lot to be desired (for example, the way YouTube’s Content ID tramples on what would almost certainly be fair use, because Google wants to avoid being taken to court by rights holders and so preempts the problem by being overly harsh on potential infringement). However, my broad point is that whether a court decides something is fair use relies on a holistic assessment that considers all four pillars of fair use, including how strongly each applies.
AI trained on artists’ works is different from making a collage of art because of the scale of the scraping: a huge amount of copyrighted work has been used, and entire works were used, even if the processing of them is considered transformative (let’s say, for the sake of argument, that training an AI is highly transformative). The pillar that AI runs up against the most, though, is “the effect of the use upon the potential market”. AI has already had a huge impact on the market for artistic works, and it is having a hugely negative impact on people’s ability to make a living through their art (or other creative endeavours, like writing). What’s more, the companies pushing AI are making inordinate amounts of revenue, which makes the whole thing feel especially egregious.
We can draw on the ideas of fair use to understand why so many people feel that AI training is “stealing” art whilst being okay with collage. In particular, it’s useful to ask what the point of fair use is. Why have a fair use exemption to copyright at all? The reason is that one of the purposes of copyright is to encourage people to make more creative works: if you’re unable to make any money from your efforts because you’re competing with people selling your own work faster than you can, then you’re pretty strongly disincentivised to make anything at all. Fair use is a pragmatic exemption carved out in recognition that if copyright is overly restrictive, it will end up making it disproportionately hard to make new stuff. Fair use is as nebulously defined as it is because it is, in theory, guided by the principle of upholding the spirit of copyright.
Now, I’m not arguing that training an AI (or generating AI art) isn’t fair use — I don’t feel equipped to answer that particular question. As a layperson, it seems like current copyright laws aren’t really working in this digital age we find ourselves in, even before we consider AI. Though perhaps it’s silly to blame computers for this, when copyright wasn’t really helping individual artists much even before computers became commonplace. Some argue that we need new copyright laws to protect against AI, but Cory Doctorow makes a compelling argument about how this will just end up biting artists in the ass even worse than the AI. Copyright probably isn’t the right lever to pull to solve this particular problem, but it’s still a useful thing to consider if we want to understand the shape of the whole problem.
As I see it, copyright exists because we, as a society, said we wanted to encourage people to make stuff, because that enriches society. However, that goal was in tension with the realities of living under capitalism, so we tried to resolve that through copyright laws. Copyright presented new problems, which led to the fair use doctrine, which comes with problems of its own, with or without AI. The reason people consider AI training to be stealing is that they understand AI as a dire threat to the production of creative works, and they attempt to articulate this through the familiar language of copyright. However, that’s a poor framework for addressing the problem that AI art poses. We would do better to strip this down to its ethical core so we can see the actual tension that people are responding to.
Maybe we need a more radical approach to this problem. One interesting suggestion that I’ve seen is that we should scrap copyright entirely and implement a generous universal basic income (UBI) alongside other social safety nets. If creatives were free to make things without worrying about meeting basic living needs, AI scraping would be far lower stakes for individual creatives. One problem with this is that most people would prefer to earn more than even a generous UBI would provide, so they would probably still feel cheated by Generative AI. However, the argument is that Generative AI cannot compare to human artists when it comes to producing novel or distinctive art, so the most reliable way to obtain meaningful art would be to give financial support to the artists (especially if an individual is after something of a particular style). I’m not sure how viable this approach would be in practice, but I think that discussing more radical ideas like this is useful in figuring out what the heck to do.
Collage art retains the original components of the art, adding layers the viewer can explore and seek the source of, if desired.
VLMs, on the other hand, intentionally obscure the original works by sending them through filters and computer vision transformations to make the original work difficult to backtrace. This is no accident; it’s designed obfuscation.
The difference is intent: VLMs literally steal copies of art to generate their work for cynical tech bros. Classical collages take existing art and show it in a new light, with no intent to pass off the original source materials as their own creations.
The original developers of Stable Diffusion and similar models made absolutely no secret of the source data they used. Where are you getting this idea that they “intentionally obscure the original works… to make [them] difficult to backtrace”? How would an image generation model even work in a way that made the original works obvious?
“Literally steal”
Copying digital art wasn’t “literally stealing” when the MPAA was suing Napster and it isn’t today.
“For cynical tech bros”
Stable Diffusion was originally developed by academics working at a University.
Your whole reply is pretending to know intent where none exists, so if that’s the only difference you can find between collage and AI art, it’s not good enough.

Only a note: LLMs are for text.

Thanks. I edited.
If you download a checkpoint from untrustworthy sources, definitely. And that is the majority of people, but it is also the majority that neither uses the technical tools that deeply nor cares about actual art (mostly porn, if Civitai, the largest distributor of models, is any reference).
The technical tool that allows actual creativity is called ComfyUI, and it is open source. I have yet to see anything even comparable. Other creative tools (like the Krita plugin) use it as a backend.
I am willing to believe that someone with a soul for art and complex flows would also make their own models, which naturally allows much more creativity and is not that hard to do.
Eh, I’m not so sure on that. I often find myself tripping up on the xkcd Average Familiarity problem, so I worry that this assumption is inadvertently a bit gatekeepy.
It’s the unfortunate reality that modern tech makes it pretty hard for a person to learn the kind of skills necessary to be able to customise one’s own tools. As a chronic tinkerer, I find it easy to underestimate how overwhelming it must feel for people who want to learn but have only ever learned to interface with tech as a “user”. That kind of background means that it requires a pretty high level of curiosity and drive to learn, and that’s a pretty high bar to overcome. I don’t know how techy you consider yourself to be, but I’d wager that anyone who cares about whether something is open source is closer to a techy person than the average person.
I should add some nuance: for a person who already actively uses ComfyUI, knows how the different nodes work, and makes complex flows with them, making their own checkpoints is not a big step up.
I have not gotten to that level myself yet; I am still learning how to properly use different and custom nodes. And yes, in the meantime I experiment with public models that were trained on stolen artwork. But I am not posting any of the results; it’s purely personal practice.
I have already seen some material about making your own models/checkpoints, and if I ever get happy enough with my skills to post the results as art, then having my own feels like a must. The main reason I haven’t yet is that it takes a lot of time to prepare the training data.
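(For a sense of what that prep involves: many local trainers expect a folder of images with same-named caption text files. Here is a minimal sketch of that normalisation step, assuming a folder of your own source images; the paths, target resolution, and caption text are placeholders, and the exact layout your particular trainer expects may differ.)

```python
# Hypothetical sketch: normalise a folder of my own drawings into the
# "image + same-named .txt caption" layout that many local trainers accept.
# Paths, the target size, and the caption text are placeholders.
from pathlib import Path
from PIL import Image

SRC = Path("my_drawings")    # originals, any size
DST = Path("training_set")   # what the trainer will read
DST.mkdir(exist_ok=True)

for i, src in enumerate(sorted(SRC.glob("*.png"))):
    img = Image.open(src).convert("RGB")
    # Centre-crop to a square, then resize to the model's native resolution.
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize((1024, 1024))
    stem = f"{i:05d}"
    img.save(DST / f"{stem}.png")
    # Caption file: writing good captions is the slow, manual part.
    (DST / f"{stem}.txt").write_text("my art style, ink sketch of ...")
```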
People who call themselves artists without using their own models are cheating themselves most of all.
I think there’s a stark difference between crafting your own ComfyUI workflow, getting the right nodes and ControlNets and checkpoints and whatever, tweaking it until you get what you want, and someone telling an AI “make me a picture/video of X.”
The least AI-looking AI art is the kind that someone took effort to make their own. Just like any other tool.
Unfortunately, gen AI is a tool that gives relatively good results without any skill at all. So most people won’t bother to do the work to make it their own.
I think that, like nearly everything in life, there is nuance to this. But at the same time, we aren’t ready for the nuance because we’re being drowned by slop and it’s horrible.