Confluent S-1 (sec.gov)
95 points by mattmarcus on June 2, 2021 | 36 comments


Some interesting stuff about the relationship between enterprise and open source. I know the 'risks' section is always to be taken with a pinch of salt, but I found this interesting.

> Software developers, including those within our customers’ IT departments, are often familiar with our underlying technology and value proposition. We rely on their continued adoption of our offering to evangelize on our behalf within their organizations and increase reach and mindshare within the developer community.

> Actions that we have taken in the past or may take in the future with respect to Apache Kafka or our community license, including the development and growth of our proprietary offering, may be perceived negatively by the developer community and harm our reputation.

They know that there's a fine line between keeping enough open source to be seen as 'really' open source software and holding enough back to sell. The 'open core' model is sometimes abused: vendors intentionally cripple the open-source version to prod people towards the closed one.

Also confirms what I have heard from a lot of smaller developer-tools companies - 'traditional' marketing and sales don't really work, and they go to considerable effort to get support from the bottom up, instead of taking the CIO to a fancy conference.


Can someone ELI5 what Confluent is/does?

Going to the landing page of any enterprise B2B SaaS company is always bizarre. You can tell nobody cares if the copy makes sense because these types of products get sold in person.


They offer commercial support services/contracts for Apache Kafka.

They sell proprietary Apache Kafka add-ons for enterprise such as authentication/authorisation RBAC. The open source Kafka release supports auth/authz but not consistently between the suite of tools. They also have a monitoring UI etc.

They release the commercial add-ons under the Confluent platform distribution.

They heavily push ksqlDB, which is a SQL layer on top of Kafka that competes with https://lenses.io/. Confluent market it as a DB but that’s pushing the creative marketing.

They provide a hosted Kafka service.

The hosted service is likely more attractive to SMEs. It also faces competition in the SME market from AWS, which provides managed Kafka. Confluent’s service does have extras on top of AWS, like tiered storage behind a non-open-source license, but nothing is stopping anyone from building their own tiered storage. I can see people using Confluent’s cloud but don’t see it being big.

Large corporations will (have) generally hosted/run their own Kafka.

I like Kafka but can’t see how Confluent can be profitable.

Cloudera is an example: they couldn’t make the service business work, and I feel Confluent will have the same issues.

Elasticsearch has issues with AWS eating some of the market. Confluent has the exact same issues.

The "Kafka is your nerve center" narrative is great from Confluent’s perspective financially, but it’s no different from the old enterprise service bus as the nerve center for everything. There’s a reason we don’t see that anymore.


AWS managed Kafka is great but it's really just Kafka. Confluent offers stuff like topic monitoring, an HTTP bridge, and other services that make Kafka more practical to actually use (dare I say, missing features).

My company uses Kafka extensively. We hosted it ourselves but after a few Kafka meltdowns we decided to go with a hosted solution. We compared AWS's offering to Confluent and at the time found that while Confluent had all of the nice tooling, we'd already built our own tooling to do the same thing, and rebuilding our entire infrastructure was a non-starter. So we went with AWS. It was a drop-in replacement for our own Kafka and it hasn't blown up on us since.

If we were starting fresh, Confluent would be more viable. I can't help but wish that the offerings that are part of Confluent were part of Kafka though. Kafka's lack of the tools that Confluent offers is imo what's holding it back from totally taking over the world.


We build a complete drop-in replacement for Confluent's metrics/management tooling (https://kpow.io) that fills a very large gap in the engineering experience for Kafka. I say replacement but tbh we offer a lot more in terms of features other than stopping/starting clusters.

It feels obvious to me that AWS will roll out S3-backed storage and managed Kafka Connect at some point; their recent IAM / ACL integration points to a pretty active MSK team.

I completely agree in the past there were some pretty big gaps - that's why we built kPow - my feeling is those gaps are narrowing pretty significantly.


Kafka S3 dumping would be amazing, could really clean up our tooling. I'll check out your product as well.


Honestly I can't wait for perpetual topics backed by S3 integrated with MSK. If the AWS team aren't working on it I will eat my own shoes.

Let me know how you go with kPow! If you want a guided tour just let me know. :)

ps. we're releasing a major bump to the UI later today..


> I like Kafka but can’t see how Confluent can be profitable.

I've seen what some corporate customers pay for an enterprise license. It's kind of insane.


SaaS S-1s often have a section about this. From Confluent's:

"As of March 31, 2021, we had 561 customers with $100,000 or greater in annual recurring revenue, or ARR, across a wide range of industries, compared to 374 such customers as of March 31, 2020, representing year-over-year growth of 50%. As of March 31, 2021, we had 60 customers with $1.0 million or greater in ARR, compared to 33 such customers as of March 31, 2020, representing year-over-year growth of 82%."

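The growth percentages in that quote can be checked directly from the customer counts. A quick sketch of the arithmetic (the helper name `yoy_growth` is made up for illustration; the counts are the ones quoted from the S-1):

```python
def yoy_growth(current: int, previous: int) -> float:
    """Year-over-year growth as a percentage of the previous value."""
    return (current - previous) / previous * 100

# Customer counts quoted from the S-1 (March 31, 2021 vs March 31, 2020).
growth_100k = yoy_growth(561, 374)  # customers with >= $100k ARR
growth_1m = yoy_growth(60, 33)      # customers with >= $1.0M ARR

print(f"$100k+ ARR customers: {growth_100k:.0f}% growth")
print(f"$1M+ ARR customers: {growth_1m:.0f}% growth")
```

Both figures round to the 50% and 82% stated in the filing.
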

We considered Confluent at my last job, where our throughput was going to be minuscule to start but we required VPC peering...that bumped us into an Enterprise plan for something like 80k/year! Went with Aiven instead which was more like 20k/yr IIRC.


The trouble is if you need VPC peering then it requires a dedicated VPC, dedicated k8s, dedicated instances. Confluent currently only offers one instance size, so that dedicated offering can't be scaled down.

PrivateLink allows for shared VPC and hypothetically a shared k8s. So the cost structure can be better on the Confluent side. Of course scaling down instance types would make a huge difference, but I don't think that's supported currently. The cheaper option is multi-tenant Kafka clusters.

In general, I think VPC peering is a dead end for use cases where you are connecting two companies' networks together. It's not great operationally and puts a lot of security / filtering burden on each party.

PrivateLink is a much better option.


I'm not too knowledgeable about all this, I'm sure you're right, and it was someone else's requirement that we have peering. Still, we'd grown accustomed to other vendors offering it by default (Timescale Cloud, Aiven) so it was a serious sticker shock and a big part of making us go elsewhere.


Yes it is, yet they have massive losses. The money is also likely in the same ballpark as Cloudera / Hortonworks and other companies providing professional services for open source software, who have all struggled to make the business model work.


Their consulting when we asked was on the order of $6000 per consultant per day. And they were fully booked.


My beef with their tiered storage is that Confluent gets all the benefits while the customer gets almost none, at least with their hosted service: if you let them offload older messages to S3, they don't charge you any less for them! Can still get perf benefits, though.


> They heavily push ksqldb which is a sql layer on top of Kafka

Can Kafka replace a database? We have an architect within our company who heavily pushes using Kafka for everything (as a database, a caching layer, message bus, etc.)


Kafka can replace a database in some cases, depending on your query patterns and ability to partition the data intelligently.

It excels at write-heavy workloads, and if you're already using Kafka for streaming or message passing, you can use it as a key-value store to avoid the extra operational burden of running a distinct service for caching or storage.

But if you need a data warehouse or just a relational query engine for data at rest, or if your data volume is not too large and you're not already using Kafka, you're probably better off with a regular DB.
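
To illustrate the key-value point, here is a toy sketch in plain Python. It does not use any Kafka client or API; it only models the idea of replaying an append-only log of keyed messages into a "latest value per key" view. The names `Log` and `materialize`, and the use of `None` as a delete marker, are assumptions for the example.

```python
from typing import Optional

# An append-only log of (key, value) messages; None marks a delete.
Log = list[tuple[str, Optional[str]]]

def materialize(log: Log) -> dict[str, str]:
    """Replay the log in order into a key-value view (latest write wins)."""
    view: dict[str, str] = {}
    for key, value in log:
        if value is None:
            view.pop(key, None)  # delete marker: drop the key if present
        else:
            view[key] = value
    return view

log: Log = [
    ("user:1", "alice"),
    ("user:2", "bob"),
    ("user:1", "alice-renamed"),  # overwrites the earlier value
    ("user:2", None),             # deletes user:2
]
print(materialize(log))
```

Lookups by key work well in this model, but anything resembling a join or an ad hoc query means scanning or re-indexing the log, which is where a regular DB is the better fit.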


Thank you.


They're a Kafka SaaS company, offering cloud hosted Kafka instances as well as support.

They also contribute to Kafka and manage various language-specific drivers.


Yup, it's founded by a bunch of the original developers of Kafka to provide additional services around it.


They're one of the main contributors to Kafka. They offer closed-source software directly related to Kafka, Kafka SaaS and support/training.


bizarre -> kafkaesque


Hmmm they really look to have huge losses. I understand a big part of that is because of growth, but they have a bit more than a year's cash remaining based on current burn - maybe 5 or 6 quarters? This IPO will need to generate a lot of cash to get them to a sustainable place.

Their cloud growth is good, but their total growth (~50%), although nice, isn’t that big for a company doing 236m in revenue while spending ~400m to get there.

Still I like their product and will keep an eye on this one.


Sounds a lot like Basho! :/


Interestingly they don’t list Apache Pulsar or Splunk as competitors yet list AWS.

Splunk acquired Streamlio, the people behind Pulsar, in 2019, so with the Confluent IPO you have to ask: what is Splunk doing with Pulsar? It is looking like a missed opportunity at this point.

Pulsar is a direct replacement for Kafka. Pulsar even supports the Kafka wire protocol.


Interesting. Kinda weird that they don't see more growth, they do have a pretty compelling "serverless Kafka" cloud offering for those cases where Kafka shines, but where the ops overhead of running your own is a bit too much, which should cover quite a lot of real-world cases. IIRC last time we checked they weren't able to offer Confluent Cloud under fully GDPR-compliant terms, so we had to pass, but otherwise we were pretty impressed, especially considering the pretty attractive pricing at the time.


I guess they need funding to better publicize their offering. It's almost one-click activation in Google Cloud (I guess in AWS/Azure it would be the same).

IMHO their biggest competitors are the cloud providers themselves (with Pub/Sub, SQS, etc.), MQTT, ZeroMQ, and the open source Kafka (which is surprisingly easy to set up).

For me, the best idea they had was the Kafka schema registry (https://www.confluent.io/product/confluent-platform/data-com...): it's too easy to inject bytes into Kafka, but to use that stuff you need to know the format (and no, dumping a JSON string is not a good substitute for a schema). Instead of guessing, and pulling your hair out when there are changes, they provide a way to insert the data with the correct types and format.
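
A rough sketch of why a schema beats "just dump JSON", in plain Python. This is not the Confluent Schema Registry API; `ORDER_SCHEMA`, `validate` and `encode` are hypothetical names, and the schema format (field name mapped to a Python type) is an assumption made up for the example.

```python
import json

# Hypothetical schema: field name -> required Python type.
ORDER_SCHEMA = {"order_id": int, "amount": float, "currency": str}

def validate(record: dict, schema: dict) -> None:
    """Raise ValueError if fields are missing, unexpected, or mistyped."""
    missing = schema.keys() - record.keys()
    extra = record.keys() - schema.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for field, expected in schema.items():
        actual = type(record[field]).__name__
        if not isinstance(record[field], expected):
            raise ValueError(f"{field}: expected {expected.__name__}, got {actual}")

def encode(record: dict, schema: dict) -> bytes:
    """Validate first, then serialize, so bad records fail at the producer."""
    validate(record, schema)
    return json.dumps(record).encode("utf-8")

good = {"order_id": 1, "amount": 9.99, "currency": "EUR"}
bad = {"order_id": "1", "amount": 9.99, "currency": "EUR"}  # id sent as a string

payload = encode(good, ORDER_SCHEMA)
try:
    encode(bad, ORDER_SCHEMA)
except ValueError as err:
    print("rejected:", err)
```

The point is where the failure happens: the producer gets an error immediately, instead of every consumer discovering the format change later.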

(disclaimer I don't have shares in confluent, nor interest in buying some)


> I guess they need funding to better publicize their offering

That's possible. Or perhaps AWS MSK is eating some of their lunch; I hear it's pretty good.

> open source Kafka (which is surprisingly easy to set up)

I've found the hard part isn't day 1 (you do get there quite easily these days), but day 20 when something breaks in an unexpected way. Like when we shut down a cluster to fix issues with the underlying machines and one broker only kept its config, but not its data directory. When restarting the cluster, the one broker without data was the first to be online, since it didn't have to load tens of GB from disk. That lone broker made itself a cluster of one, discovered all our topics, made itself leader for all partitions of all topics ... and when the other brokers came online and joined the cluster, they discovered the no-data leader and promptly started to sync their state to the leader's by deleting the messages they still held, resulting in three empty brokers, 100% data loss.

This was a way pre-production system and we were happy to break it in interesting ways, but the ease with which we got it into such a state really made us treat it with a lot of respect ever since (that wasn't the only such occurrence, either). Kafka is a very powerful piece of software and a big deal for the use cases it's built for, but it has some very thorny parts, and if you can have someone run it for you, you probably should give that some thought.
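
The failure described above can be reproduced in a deliberately simplified toy model. This is not Kafka's actual replication protocol; it only encodes the two behaviours from the story (first broker online becomes leader, later brokers sync to the leader's state), and `restart_cluster` and the broker names are made up for the example.

```python
def restart_cluster(
    brokers: dict[str, list[str]], startup_order: list[str]
) -> dict[str, list[str]]:
    """Toy model: the first broker online becomes leader for everything,
    and each later broker replaces its own log with the leader's log."""
    leader = startup_order[0]
    result = {leader: list(brokers[leader])}
    for name in startup_order[1:]:
        # The follower discards what it held and copies the leader's state.
        result[name] = list(result[leader])
    return result

brokers = {
    "broker-1": ["m1", "m2", "m3"],
    "broker-2": ["m1", "m2", "m3"],
    "broker-3": [],  # kept its config but lost its data directory
}

# The empty broker has nothing to load from disk, so it comes up first.
after_bad_restart = restart_cluster(brokers, ["broker-3", "broker-1", "broker-2"])
# If a broker that still holds data comes up first, nothing is lost.
after_good_restart = restart_cluster(brokers, ["broker-1", "broker-2", "broker-3"])

print(after_bad_restart)
print(after_good_restart)
```

In this model the outcome depends entirely on startup order, which is what makes the scenario so easy to hit by accident.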


The schema registry is pretty nice, but there are alternatives in that space too, such as Karapace from Aiven, or Apicurio from IBM.

(disclaimer: I work for Aiven, we also provide Apache Kafka as a service)


TL;DR - $230 million in losses last year, mostly because of headcount costs to fuel growth, with a bit over $200 million working capital left. Yeah they're growing quickly, but clearly not fast enough for VC firms to offer better terms than Wall Street underwriters.


This looks much worse than I expected for a company synonymous with Kafka. They only have 60 customers with 1m+ ARR for a product that's used by pretty much every established company.


IMO Confluent is trying to rush this IPO. I'd be pushing for an IPO aggressively too, given everything that has happened with Elasticsearch.

It showed a lot of the industry that Elastic didn't have nearly as much to offer as everyone thought once Amazon stepped into the ring, and the same is apparent with Confluent.

A similar situation seems to be emerging with their managed Kafka service. Once you're on AWS there will be an incentive to move to their Firehose platforms.

Very few customers actually should be running a Kafka cluster at all and the bulk of those would be better off running Pulsar to reduce their pipeline storage costs significantly.


Isn’t Kinesis the real competition from AWS? It’s been around for a long time and Amazon pushes it pretty hard. The downside being linear cost scaling, meaning that your sweet spot for pricing is when your volumes are too low to support having your own operations team.
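
The "sweet spot" argument is a break-even calculation between a cost that scales linearly with volume and one that has a fixed ops overhead. A sketch with made-up numbers (none of these figures are real Kinesis or Kafka prices, and the function names are hypothetical):

```python
def managed_cost(volume: float, price_per_unit: float) -> float:
    """Pay-as-you-go: cost scales linearly with volume."""
    return volume * price_per_unit

def self_hosted_cost(volume: float, fixed_ops: float, marginal: float) -> float:
    """Own cluster: fixed ops-team cost plus a smaller per-unit cost."""
    return fixed_ops + volume * marginal

def break_even_volume(price_per_unit: float, fixed_ops: float, marginal: float) -> float:
    """Volume at which both options cost the same."""
    return fixed_ops / (price_per_unit - marginal)

# Illustrative assumptions only, in arbitrary currency and volume units.
PRICE, FIXED_OPS, MARGINAL = 10.0, 30_000.0, 2.0

threshold = break_even_volume(PRICE, FIXED_OPS, MARGINAL)
print(f"break-even at {threshold:.0f} units")
for volume in (1_000, 10_000):
    cheaper = (
        "managed"
        if managed_cost(volume, PRICE) < self_hosted_cost(volume, FIXED_OPS, MARGINAL)
        else "self-hosted"
    )
    print(volume, cheaper)
```

Below the threshold the linear pricing wins because there is no ops team to pay for; above it, the fixed cost is amortized and running your own becomes cheaper.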


Elastic results and stock today say otherwise.


If I recall, the announcement happened in January-ish? They're down since that point, but the tech market as a whole is down since January.

It'll take up to 3 years for the full effect to be felt, because of the time needed for new projects, rewrites, and end of life on the ES version that was last open source.

Everyone I know when they read that blog post was like, "Alright, moving forward we're not vendor locking ourselves into the license with Elastic."

On top of this most companies were going with Elastic for the enterprise features but those are now becoming free with AWS's version.


What will the full effects be? And will they be positive or negative?



