Otherwise the larger picture is that MCP is a land grab for building an ecosystem around integrations to get access to data. Your LLM agent is not valuable if it can't access things for you... and from a market perspective, enterprise already pays a lot for this stuff, and yes, MCP is not really thought out for enterprise at all... At least thankfully they added stateless connections to the spec...
Tomato, tomato. If it’s not FOSS, I’m not going to sign off on wasting time on it.
(Yes, of course I use proprietary services where necessary and they can’t be avoided. This isn’t one of those cases. Example of things where I’m pretty adamant about it: server OSes. Databases. Programming languages. Web servers.)
> Licensor grants You a limited, non-exclusive, revocable, non-sublicensable, non-transferable,
For now it's source-available with a generous limit, but that can be changed or revoked at any time, which may immediately make your existing installations illegal.
My understanding is it’s the opposite: that license is the only thing granting you usage rights. In the absence of a contract, or words in the license to the contrary, they could revoke those rights on a whim. It’s not so much that you have a default right to use their proprietary software and they may issue something that revokes it. It’s that you have no right to use their software except their continued good will.
The GPL says:
> All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met.
Source-available licenses don't allow you to build on the software or patch it the way you see fit.
Heck, some source-available licenses don't even allow you to compile the thing, let alone take parts of it and use them elsewhere.
However, I somewhat like source-available licenses at the moment, because they're neat little mines that sneak into the training sets of generative AI models and make the models less suitable for serious work.
How does it make the models less suitable? Wouldn't more high quality source code help improve it? If it was closed source entirely it couldn't be trained on.
If it’s trained on proprietary software and then injects non-Free code into your project, you may have all kinds of unplanned legal exposure. That’s what makes such a model less suitable.
yep, agreed. I think that's another way to view the current state of "GenAI" tooling (e.g. all those complicated frameworks that received $M in funding) and why things like https://www.anthropic.com/research/building-effective-agents fall on deaf ears...
Howdy! Erick from LangChain here. If anyone is seeing version conflicts on particular packages, please let me know!
These usually stem from overly strict constraints in the underlying SDKs for the integrations, and in general we've been pretty successful in asking for those constraints to be loosened. The main "problem" constraint we've seen in the past has been on httpx. Curious if you've seen others!
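To make the failure mode concrete, here's a minimal sketch of why a strict upper bound can make resolution impossible. The pins below are hypothetical, not any real SDK's:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Hypothetical pins: an integration SDK caps httpx while another
# dependency in the same environment requires a newer release.
sdk_pin = SpecifierSet("<0.25")
newer_pin = SpecifierSet(">=0.27")

candidate = Version("0.27.0")
print(candidate in sdk_pin)    # False
print(candidate in newer_pin)  # True
# No single httpx version satisfies both sets, so the resolver fails;
# loosening the SDK's upper cap is what resolves the conflict.
```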
The challenge with this approach, though, is that you need to run the actual code to see what it does, or build up a mental model of it as a developer... but it does shine in certain use cases -- and it also reminds me of https://github.com/insitro/redun, which takes the same approach.
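For anyone unfamiliar with that style, here's a toy sketch of why you can't know the graph without running the code. The `task` decorator below is my own stand-in, loosely mimicking redun-style decorators, not any project's real API:

```python
tasks = []

def task(fn):
    # Toy stand-in: a real engine would record a graph node and defer execution.
    def wrapper(*args):
        tasks.append(fn.__name__)
        return fn(*args)
    return wrapper

@task
def process(item):
    return item * 2

@task
def main(items):
    # Fan-out is data-dependent: you only learn how many process()
    # nodes exist by actually executing main().
    return [process(i) for i in items]

main([1, 2, 3])
print(tasks)  # ['main', 'process', 'process', 'process']
```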
Forgot to mention: on the roadmap we have pluggable executors, e.g. delegating the parallelism of whole agents/subgraphs to systems like Ray. Would love to collaborate with folks if that's interesting.
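Nothing is set in stone, but roughly the seam we have in mind looks like the sketch below. The `TaskExecutor` protocol is hypothetical, and the Ray variant assumes `ray` is installed:

```python
from typing import Any, Callable, Protocol

class TaskExecutor(Protocol):
    # Hypothetical seam: how the framework could hand off task execution.
    def submit(self, fn: Callable[..., Any], *args: Any) -> Any: ...

class LocalExecutor:
    def submit(self, fn: Callable[..., Any], *args: Any) -> Any:
        # Run inline in the current process.
        return fn(*args)

class RayExecutor:
    def submit(self, fn: Callable[..., Any], *args: Any) -> Any:
        import ray  # assumes a local or running Ray cluster
        # Wrap the function as a Ray task, run it remotely, block on the result.
        return ray.get(ray.remote(fn).remote(*args))
```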
Looks great. But having been burned by other "abstractions", e.g. LangChain, I'm wary of oversimplification. How are you going to avoid making those same mistakes?
I don't think so; I see them as complementary. MinIO is great when you have downstream applications that speak the S3 API and need acceleration of that data. Regatta is designed for applications that speak file semantics (think application logging, storing corpuses of training data, or state) and don't run on the S3 API. Regatta actually supports MinIO as an S3-compatible backend for your file system!
I think it’s more analogous to MinIO’s discontinued proxy mode, where you’d talk to MinIO locally (using whatever interface/protocol) and it would act as a local cache for S3 objects. If you wrote to it, it would propagate the changes up to S3 proper (or whoever else speaks the S3 protocol).
I believe they stopped supporting that mode because they didn’t want to keep chasing every S3 protocol change. However, if you’re just using S3, and not trying to masquerade as S3, this problem becomes easier.
I think it's complementary as well, even more so after MinIO deprecated its Gateway and Filesystem modes a couple of years ago. MinIO is "S3-compatible" object storage, so technically MinIO users should be able to use your product to get a file-system-like experience on their buckets and objects. Since you're using IAM, though, either your client would need to handle pure S3 credentials, or a third-party plugin to your client would have to do that. It could be a good opportunity to piggyback on MinIO's user base.
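For context on the credentials point: a stock S3 client can already talk to MinIO by overriding the endpoint, so "handling pure S3 credentials" amounts to something like this sketch (endpoint, keys, and bucket below are placeholders):

```python
import boto3

# MinIO speaks the S3 API, so a standard client only needs a custom
# endpoint and MinIO-issued credentials. All values are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.internal:9000",
    aws_access_key_id="MINIO_ACCESS_KEY",
    aws_secret_access_key="MINIO_SECRET_KEY",
)
print(s3.list_objects_v2(Bucket="my-bucket").get("KeyCount"))
```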
We had built an MLOps platform[0] a few years ago and enabled users to use their S3 buckets in a file-system-like manner. This made it possible for them not to have to know or write S3-specific code in their Jupyter notebooks, as most people in the industry did with boto3, which also forced them to write code (say, using TensorFlow) in a certain way for training to consume the files, err, objects. It was a mess, and we did away with it so notebooks could run the same way on a laptop or on the platform, even with the shell kernel, so people could explore objects like files. MLflow could work on a filesystem or on S3, but it had no authentication, so we built around that to know which user/experiment produced which artifact.
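Roughly, this is the contrast we were eliminating (bucket, key, and mount path below are illustrative):

```python
import boto3

# What notebooks typically did: S3-specific code tied to boto3.
s3 = boto3.client("s3")
s3.download_file("my-bucket", "datasets/train.csv", "/tmp/train.csv")

# What file-system-like access lets the same notebook do instead,
# so it runs unchanged on a laptop or on the platform:
with open("/data/datasets/train.csv") as f:
    header = f.readline()
```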
MinIO had a Gateway that was deprecated. We didn't use it much, and they didn't have an admin client at the time, so I rolled my own to orchestrate the thing.
The reason I built it to hook into users' existing compute and storage, as opposed to offering storage/compute myself, was threefold:
- Organizations already had their data somewhere with established policies. Getting them to move that data is very hard (CISO, CTO, IT, legal, engineers). Friction would have been huge.
- Organizations already had budgeted compute and storage, they may have had contracts/discounts/credits with cloud providers and it didn't make sense to ask them to make a decision on budgeting for another solution.
- A design principle of having the product being able to die without leaving the users scrambling to exfil/migrate data.
One way to do it is with FUSE, though your mileage may vary (s3fs-fuse, goofys, etc.). Amazon released Mountpoint last year[1], and one question you'll get asked is: why use Regatta when I could use Mountpoint?
We are finding a lot of success in the MLOps space for exactly this reason. I also completely agree that enterprise customers want to keep their data where they can govern and audit it (often in S3). We're excited about the possibility of letting folks access and use that data while it stays in S3 for primary storage.
I agree with the questions around Mountpoint, and we're solving a very different set of problems than Mountpoint is. Mountpoint, for example, isn't designed to be used with all file applications and lacks support for things like appends to existing files, random writes, renames, and symbolic links. Regatta, on the other hand, supports POSIX semantics and can work with nearly all file-based applications.
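Concretely, these are the kinds of calls that fail on a Mountpoint mount today (per the limitations above) but work against a full POSIX file system. The mount path below is illustrative:

```python
import os

path = "/mnt/fs/logs/app.log"  # illustrative mount path

with open(path, "a") as f:     # append to an existing file
    f.write("another line\n")

os.rename(path, path + ".1")   # rename in place
os.symlink(path + ".1", path)  # create a symbolic link
```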