
(Disclaimer: I'm an investor in Krea AI.)

When Diego first showed me this animation, I wasn't completely sure what I was looking at, because I assumed the left and right sides had been composited together after the fact. But it's a unified screen recording; the generated right side keeps pace with the riffing the artist does in the little paint program on the left.

There is no substitute for low latency in creative tools; if you have to sit there holding your breath every time you try something, you aren't just linearly slowed down. There are points that are simply too hard to reach in the slow, deliberate, 30+ second steps that classical diffusion generation requires.

When I first heard about consistency models, my assumption was that they were just an accelerator. I expected we'd get faster, cheaper versions of the same kinds of interactions with visual models we're used to seeing. The fine hackers at Krea did not take long to prove me wrong!


Exactly.

There is no substitute for real-time when you're doing creative work.

That's why GitHub Copilot works so well; that's why ChatGPT struck a chord with people—it streamed the characters back to you quite fast.

At first, I was skeptical too. I asked myself, “What about Photoshop 1.0? They surely couldn't do it in real time.” It turns out that even then you needed it. Of course, the compute wasn't there to translate every rasterized pixel of an image within a layer in real time, but they had a trick: they showed you an outline that told you, the user, where the content _would_ render if you let go of the mouse.
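To make the trick concrete, here's a toy sketch of the cheap-preview-then-commit pattern in Python. It is not Photoshop's actual code; the Layer class and the event handlers are made up purely for illustration.

    class Layer:
        """A hypothetical raster layer; moving its pixels is the expensive part."""
        def __init__(self, x, y, w, h):
            self.x, self.y, self.w, self.h = x, y, w, h

        def outline_at(self, dx, dy):
            # Cheap: just report the rectangle where the content *would* land.
            return (self.x + dx, self.y + dy, self.w, self.h)

        def commit_move(self, dx, dy):
            # Expensive in a real editor: re-blit every pixel of the layer.
            self.x += dx
            self.y += dy

    layer = Layer(10, 10, 640, 480)
    pending = (0, 0)

    def on_drag(dx, dy):              # fires continuously while the mouse is down
        global pending
        pending = (dx, dy)
        print("preview outline at", layer.outline_at(dx, dy))

    def on_mouse_up():                # fires once, when the user lets go
        layer.commit_move(*pending)
        print("layer committed at", (layer.x, layer.y))

    on_drag(5, 3); on_drag(12, 7)     # instant feedback while the user riffs
    on_mouse_up()                     # the heavy work happens once, at the end

The user gets continuous feedback at the cost of drawing a rectangle, and pays the full per-pixel cost only once, on release.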

You can see the workflow here:

> https://www.youtube.com/watch?v=ftaIzyrMDqE

It applies to general tools too; you can see the same thing in this Mac OS 8 demo (it runs in the browser!):

> https://infinitemac.org/1998/Mac%20OS%208.1


> that's why ChatGPT struck a chord with people—it streamed the characters back to you quite fast.

So did GPT-3, though. ChatGPT (3.5) was a bit faster, but not overly so.


And it did blow up! But not as much as it did after the UI changed to a (familiar) chat interface.

Good point! I agree with it but forgot to mention it: interaction matters.

With GitHub Copilot you are in familiar terrain, your code editor; with ChatGPT, you are talking to it the same way you'd talk to an assistant, via chat/email.

And we at KREA don't think AI for creativity will be the exception.


That's definitely true: the chat format (vs. the completion format) made all the difference. So much so that ChatGPT blew up even though it was inferior in capabilities to GPT-3, just because it was (much) more usable.


As an investor, I hope you’re ready to bankroll the inevitable legal battles. These are not going to be restricted to the big players. Eleuther was recently sued, and they’re a nonprofit.

The moment you try to market this, you need to be prepared for the lawsuit. I’m preparing for one, and all I did was assemble a dataset. This model is built off of work which most people (rightly or wrongly) believe is not yours to sell.

I’m still not sure how I feel about it. I was forced to confront the question a few days ago, and I’ve been in a holding pattern since then. I’m not so much concerned about the lawsuits as getting the big question right. Ethics has a funny way of sneaking up on you in the long run.

At the very least, be prepared for a lengthy, grisly smear campaign. Two people wrote stories insinuating I somehow profited off of books3. Your crew will be profiting with intent.

One reason I’ve considered bowing out of ML is that I’d rather not be verbally spit on for the rest of eternity. It’s nice to have the support of colleagues, but unless you really care solely about money, you’ll be classified in the same bucket as Zuck: widely respected if successful by the people that matter, but never able to hold a normal relationship again. Most people probably prefer that tradeoff, but go into this with eyes wide open: you will be despised.

The way out is to help train a model on Creative Commons images. I don’t know if there’s enough data. And it’s certainly a bad idea to wait; your only chance of dominating this market is to iterate quickly, which means using existing models. But at this point, lawsuits are table stakes. You need to be prepared for when they happen, not if.

Also, join me in at least one sleepless night pondering the ethics of profiting off of this. Normally people only mention this as a social signal, not because they actually care. But if you sit down and think it through from first principles, the ethics, legality aside, is not at all clear. This also isn’t a case of a Snowmaker startup (https://x.com/snowmaker/status/1696026604030595497?s=61&t=jQ...); he notes that this only works when you have the general population on your side. All of those examples are of startups violating laws that people felt were dumb. Whereas I can tell you from firsthand trauma that copyright enthusiasts are religiously fanatical. Worse, they might be on the right side of the ethics question.

This was the first time in my life that a startup’s ethics gave me pause. Not just yours, but everyone who’s building creative tools off of these models. You’ll face a stiff headwind. Valve, for example, won’t approve any game containing any work generated by your tools. And everyone else is trying to build their own moat.

I’m not saying to consider giving up. I’m saying, really sit down and go through the mental exercise of deciding whether this is a battle you want to fight for at least three years legally and five years socially. I’m happy to provide examples of the type of abuse you and your team will face, ranging from sticks-and-stones-level insults to people directly calling for criminal liability (jail time). The latter is exceedingly unlikely, but being ostracized by the general public is not.

At the very least, you’ll need to have a solid answer prepared if you start hiring people and candidates ask for your stance. This comment is as much for your team as for you as an investor, since all of you will face these questions together.


Hi sillysaurusx. I'd love to get in contact with you. As someone who contributes datasets to academic NLP, you have a unique and interesting perspective on this question.

I can't reach out to you via twitter as I am not a verified member, so I will reach out via email.


What’s your Twitter handle? I can DM you.

(It’s unfortunate that there’s no way for me to override that setting. The whole reason I made my DMs public is so people can contact me.)


@Hellisotherpe10

I don't use Twitter much. I've already sent you a rather long email and would prefer that. I emailed shawnpresser@gmail.com.


Are you an artist? Would you use this tool?


I'm an artist and I use Midjourney for faster work. You still need Photoshop skills, imagination, and an aesthetic sense to get the job done.


Taste is still valuable. It will follow a power-law dynamic; I might finish my essay about it someday.


Taste is... hard to define. I know quite a lot of artists who lack taste, but nowadays you need to look for originality offline.

Online consumption shapes digital creative work significantly. AI won't help here either.


It's "out of favor" because it completely failed as a research program. Let's not equivocate about this; it's nice to understand heuristic search, and there was a time when things like compilation were poorly understood enough to seem like AI. But as a path towards machines that succeed at cognitive tasks, these approaches are like climbing taller and taller trees in the hopes of getting to the moon.


Slight correction: HipHop for PHP was cleanroom, including rewriting large families of native extensions to work with its C++ runtime, although it eventually developed workalikes for the PHP dev headers to ease development. Source: I worked on HHVM, its JIT successor that initially shared its source tree and runtime.


Facebook developers seem to have a surprising amount of free time to go around reinventing things that are not obviously social network features. (Or to have had it in the 2010s, at least.)


It's not particularly surprising that a surprising amount of infrastructure is needed to run a social network.


Note WhatsApp had 35 employees when they were acquired and Instagram had 13. At that size you need to be productive at managing servers but you're probably not thinking how great it'd be to have a "whole new programming language and source control system" team.


WhatsApp and Instagram at the point of acquisition were simpler than Facebook was (and is), or even than they themselves are now. Once you scale, you start to need a lot of engineers to help keep things standing up and everyone on the same page.


WhatsApp had something like half a billion monthly active users when they were acquired; that could be considered fairly large scale, no? But I agree with your point in general.


Yes, but WhatsApp is a point-to-point communication tool with mostly small groups. Each individual message doesn't need to be distributed to a potentially very large audience the way it does on Facebook, which keeps processing and coordination of nodes smaller and simpler.


Note, though, that other large projects with similar git-scaling problems tended to just write wrapper tools to work around them; see how Chromium and Android do it.


That’s still Google, though. I suspect non-MetaBet companies would try harder to avoid scaling the development team in the first place.

(My other two posts in this thread are at -4 and +7 although they both make the same point. Never sure how to interpret that.)


Android and Chrome are small projects in this context :P


The idea that "virtualization" began with Zen in 2004 is rather difficult to read as an early VMware employee. Before QEMU independently discovered it, VMware was JIT'ing unrestricted x86 to a safe x86 subset from 1999 on[1]. Hardware support for trap-and-emulate virtualization came to the market in the early 'aughts after VMware had proven the market demand for it.

[1] https://www.vmware.com/pdf/asplos235_adams.pdf


When I was at VMware in the 'aughts, VESA often saved us as an unaccelerated option for guests that didn't yet have a driver for our virtual display. Was there really no VESA driver for the 9x family? Or does QEMU's BIOS not do it or something?


BearWindows and SciTech Display Doctor are the two VESA drivers which come to mind for Windows 9x. If I remember correctly Bear will also work in 3.x.

I remember these being somewhat frustrating to get working with VirtualBox. I never tried with QEMU.

I've personally moved away from virtualization for older OSes and to emulation. It just seems much easier to deal with even if it's more resource intensive.


I was Chief Architect at Slack from 2016 to 2020, and was privileged to work with the engineers who were doing the work of migrating to Vitess in that timeframe.

The assumption that tenants are perfectly isolated is actually the original sin of early Slack infrastructure that we adopted Vitess to migrate away from. From some earlier features in the Enterprise product (which joins lots of "little Slacks" into a corporate-wide entity) to more post-modern features like Slack Connect (https://slack.com/help/articles/1500001422062-Start-a-direct...) or Network Shared Channels (https://slack.com/blog/news/shared-channels-growth-innovatio...), the idea that each tenant is fully isolated was increasingly false.

Vitess is a meta-layer on top of MySQL shards that asks, per table, which key to shard on. It then uses that information to maintain some distributed indexes of its own, and to plan the occasional scatter/gather query appropriately. In practice, migrating code from our application-sharded, per-tenant old world into the differently sharded Vitess storage system was not a simple matter of pointing at a new database; we had to change data access patterns to avoid large fan-out reads and writes. The team did a great write-up about it here: https://slack.engineering/scaling-datastores-at-slack-with-v...
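To make the per-table sharding key idea concrete, here's a rough Python sketch of the routing consequence. The table names, the toy hash, and the shard count are invented for illustration; real Vitess expresses this through a VSchema and vindexes, and its planner is far more sophisticated.

    import hashlib

    # Hypothetical per-table sharding keys, in the spirit of a Vitess VSchema.
    VSCHEMA = {
        "messages": "channel_id",
        "users":    "user_id",
    }
    NUM_SHARDS = 4

    def shard_for(value):
        # Toy stand-in for a hash vindex: map a sharding-key value to a shard index.
        digest = hashlib.md5(str(value).encode()).digest()
        return int.from_bytes(digest[:8], "big") % NUM_SHARDS

    def route(table, where):
        # A query that constrains the table's sharding key goes to one shard;
        # anything else becomes a scatter/gather across all of them.
        key = VSCHEMA[table]
        if key in where:
            return [shard_for(where[key])]
        return list(range(NUM_SHARDS))

    print(route("messages", {"channel_id": 42}))   # one shard
    print(route("messages", {"posted_by": 7}))     # fan-out to every shard

The painful part of the migration was exactly the second case: access patterns that didn't include the new sharding key and therefore fanned out.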


> In the fall of 2016, we were dealing with hundreds of thousands of MySQL queries per second and thousands of sharded MySQL hosts in production.

> Today, we serve 2.3 million QPS at peak. 2M of those queries are reads and 300K are writes.

I think the "today" QPS numbers are still doable with a properly tuned single-writer galera cluster running on machines with TBs of memory. Of course, with Slack workload, there would be too much historical data to fit into a single host, so I can see the reasons to shard into multiple clusters/hosts.

Still, the numbers seem a little off. Let's say back in fall 2016 there were already 200K write QPS at peak, with 200 sharded hosts accepting writes. That's just 1K write QPS at peak per host on average, and let's say 20K write QPS at peak for a particularly hot shard. What could be the bottleneck? Replication lag? Data size? I don't think any of the articles from Slack has talked about this.
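Spelling out that back-of-the-envelope math (the inputs below are this comment's own assumed numbers; the 20x hot-shard skew in particular is just a guess):

    write_qps_2016 = 200_000   # assumed peak write QPS in fall 2016
    write_hosts    = 200       # assumed number of hosts accepting writes
    hot_shard_skew = 20        # assumed hot shard at 20x the average

    avg_per_host = write_qps_2016 / write_hosts    # 1,000 write QPS
    hot_per_host = avg_per_host * hot_shard_skew   # 20,000 write QPS

    print(avg_per_host, hot_per_host)   # -> 1000.0 20000.0

Those per-host numbers are the basis of the question above about where the real bottleneck was.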

What Vitess provides is invaluable, especially the very solid implementation of secondary indexes. But sometimes I feel like it is being used/advocated as a sledgehammer ("just keep sharding") without looking at what could be done better at the lower MySQL/InnoDB level, in exchange for a much more costly cloud bill.


Definitely wasn't expecting the chief architect at Slack to reply to that example; really appreciate the response. HN is such a blessing in that regard :). The scaling-datastores-at-Slack post is a super interesting read as well, thanks. It does make me wonder whether a fully 100% MySQL-compatible version of Yugabyte/Spanner etc. would have shifted the decision.


Random aside: were you at KubeCon a couple of years ago, chatting with Sugu at the whole-conference party in San Diego? If so, hi! I was crazy out of my depth, but listening to folks who know this stuff better than I ever will was one of the highlights of that conference.


This is accurate: Slack is exclusively using Hack/HHVM for its application servers.

HHVM has an embedded web server (Facebook's folly-based Proxygen) and can directly terminate HTTP/HTTPS itself. Facebook uses it this way. If you want to bring your own webserver, though, FastCGI is the most practical way to do so with HHVM.


Hi! We didn't do a great job writing about this at the time, but Slack's migration into HHVM took place in 2016. We've been gradually increasing the coverage of Hacklang (Facebook's gradually typed descendant of PHP) since then, and are now 100% Hacklang.


I know that the HHVM team at Facebook is pursuing a pretty aggressive strategy for updating the language with regard to deprecating language features; how is this affecting you guys? How often are you guys updating to the latest version of Hacklang/HHVM?


We decided early on to colocate most aspects of the back-end, in part because we anticipated shared channels[1], but also because provisioning even virtual hardware for each team would be prohibitively expensive: we have over 600,000 organizations in Slack today[2], too many for hard-partitioning most resources to be economical.

[1] https://www.zdnet.com/article/slack-brings-shared-channels-t... [2] https://sec.report/Document/0001628280-19-004786/


A lot of VC funds are required to liquidate and return proceeds to LPs in the event of an exit, whether through IPO or acquisition.

