Would you be willing to write an article comparing the results? Or share the code you used to test? I am super interested in the results of this experiment.
Thanks for your kind words. My code is not really novel, but it is not like the simplistic Markov chain text generators that are found by the ton on the web.
I will further improve my code and publish it on my GitHub account when I am satisfied with it.
It started as a Simple Language Model [0], which differs from ordinary Markov generators by incorporating a crude prompt mechanism and a kind of very basic attention mechanism called history. My SLM uses Prediction by Partial Matching (PPM).
The one in the link is character-based and very simple, whereas mine uses tokens and is 1300 lines of C.
The tokenizer tracks the end of sentences and paragraphs.
I didn't use subword tokenization as LLMs do, but it would be trivial to incorporate.
Tokens are represented by numbers (again, as in LLMs), not character strings.
I use Hash Tables for the Model.
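To give a concrete idea of what I mean, here is a toy sketch in C (not my actual code; every name and size is invented for illustration) of an order-2, token-level model kept in an open-addressing hash table: the (previous, current) token pair is the key, and each entry counts which tokens followed that pair in the training text.

```c
/*
 * Toy sketch only (not the real generator): an order-2, token-level Markov
 * model stored in an open-addressing hash table. Tokens are integer IDs,
 * the key is the (prev, cur) pair, and each entry counts the tokens that
 * followed that context in the training text.
 */
#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE (1 << 12)    /* power of two so the hash can be masked */
#define MAX_FOLLOW 32           /* successors tracked per context */

typedef struct { int tok; int count; } Follow;

typedef struct {
    int    used;
    int    ctx[2];              /* (prev, cur) token IDs */
    int    nfollow;
    Follow follow[MAX_FOLLOW];
} Entry;

static Entry table[TABLE_SIZE];

/* FNV-style hash of the two-token context */
static unsigned hash_ctx(int a, int b)
{
    unsigned h = 2166136261u;
    h = (h ^ (unsigned)a) * 16777619u;
    h = (h ^ (unsigned)b) * 16777619u;
    return h & (TABLE_SIZE - 1);
}

/* find (or optionally create) the entry for a context, linear probing */
static Entry *lookup(int a, int b, int create)
{
    unsigned i = hash_ctx(a, b);
    while (table[i].used) {
        if (table[i].ctx[0] == a && table[i].ctx[1] == b)
            return &table[i];
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    if (!create)
        return NULL;
    table[i].used = 1;
    table[i].ctx[0] = a;
    table[i].ctx[1] = b;
    return &table[i];
}

/* training: record that `next` followed the context (a, b) */
static void observe(int a, int b, int next)
{
    Entry *e = lookup(a, b, 1);
    for (int k = 0; k < e->nfollow; k++)
        if (e->follow[k].tok == next) { e->follow[k].count++; return; }
    if (e->nfollow < MAX_FOLLOW)
        e->follow[e->nfollow++] = (Follow){ next, 1 };
}

/* generation: sample the next token proportionally to the counts */
static int next_token(int a, int b)
{
    Entry *e = lookup(a, b, 0);
    if (!e || e->nfollow == 0)
        return -1;              /* unknown context: caller must fall back */
    int total = 0;
    for (int k = 0; k < e->nfollow; k++)
        total += e->follow[k].count;
    int r = rand() % total;
    for (int k = 0; k < e->nfollow; k++)
        if ((r -= e->follow[k].count) < 0)
            return e->follow[k].tok;
    return e->follow[0].tok;
}

int main(void)
{
    /* tiny training stream of token IDs produced by some tokenizer */
    int stream[] = { 1, 2, 3, 1, 2, 4, 1, 2, 3 };
    int n = sizeof stream / sizeof stream[0];
    for (int i = 2; i < n; i++)
        observe(stream[i - 2], stream[i - 1], stream[i]);
    printf("after (1, 2) -> token %d\n", next_token(1, 2));
    return 0;
}
```

The real code obviously does more (several context lengths, escape handling, etc.), but the key/value idea is the same: a context goes in, successor counts come out.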
There are several mechanisms used as fallbacks when the next-state function fails. One of them uses the prompt. It is not demonstrated here, but the rough idea is sketched below.
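The rough shape of that fallback chain, again only as an illustrative sketch with stub functions standing in for the real model lookups: try the longest context first, escape to a shorter one when it is unknown, and reuse a token from the prompt as a last resort.

```c
/*
 * Illustrative sketch of PPM-style fallback. The predict_* functions are
 * stand-ins for real model lookups; they return -1 for an unknown context.
 */
#include <stdio.h>
#include <stdlib.h>

static int predict_order2(int a, int b) { (void)a; (void)b; return -1; }
static int predict_order1(int b)        { (void)b;          return -1; }

static int next_with_fallback(int a, int b, const int *prompt, int plen)
{
    int t;
    if ((t = predict_order2(a, b)) >= 0)    /* full (prev, cur) context */
        return t;
    if ((t = predict_order1(b)) >= 0)       /* escape to shorter context */
        return t;
    return prompt[rand() % plen];           /* last resort: the prompt */
}

int main(void)
{
    int prompt[] = { 7, 8, 9 };
    printf("fallback token: %d\n", next_with_fallback(1, 2, prompt, 3));
    return 0;
}
```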
Several other implemented mechanisms are not demonstrated here either, such as model pruning and skip-grams. I am trying to improve this Markov text generator, and any tips in the comments would be a great help.
But my point is not to make an LLM. It's just that LLMs produce good results not because of their supposedly advanced algorithms, but because of two things:
- There is an enormous amount of engineering in LLMs, whereas usually there is nearly none in Markov text generators, so people get the impression that Markov text generators are toys.
- LLMs are possible because they build on the impressive hardware improvements of the last decades. My text generator uses only 5MB of RAM when running this example! But as commenters have pointed out, the size of the model explodes quickly, and this is a point I should improve in my code.
And indeed, LLMs, even small ones like NanoGPT, are unable to produce results as good as my text generator's when given only 42KB of training text.
Under that definition, I wonder if anybody is `creative`. We would need to assess how many of what we call `original ideas` are not rehashes of other ideas.
No, it's been discussed to death and back with "everything is a remix" since 2012. Even the combination of previous ideas into a new one is an original thought; doing it at the right time even more so. Look at history: the invention of the bike, the airplane, or the telephone. The technology needed for all of them had existed for decades; it just wasn't combined in the right way.
Glad you brought up "everything is a remix". As an artist, I have learned to remix, but I at least put my own spin on it so it's not a complete rip-off. That way I can live with myself and I'm not some plagiarist.
Greatness comes at a very high cost. But to me, greatness is not about money or economic success. Greatness is a very subjective term.
For me greatness is about mastering my craft. Creating great things. It still comes at a very high cost, because practice is not free, and it takes away time from life, especially family time.
Now, why? Look at life this way: humans only care about the experience of life. Experience can be focused on family, money, personal achievements, material things, etc.
What you refer to as a `great` experience is not really a `great` experience for another person. So `greatness`, IMHO, refers to the experience you want to have. My great experience is becoming a master in my craft. That is my image of greatness. To transcend. I trade time doing other things for time focused on my work.
Now, there could be a person whose vision of `greatness` is being the greatest dad ever. That is still greatness. So basically, what I am trying to say is that everyone looks for their own version of greatness, and it doesn't necessarily match what you define as greatness.
Sometimes there is a need to refrain from actions that would lead to greatness in one area, in order to devote the effort to greatness in a different area.
IMHO, mono-repo vs multi-repo should be decided based on the sources of change for each component in a product. For example, the cloud components of a product usually change at the same pace and for the same reasons, so it makes sense to have them in a mono-repo, even with a microservices approach. That said, even in the cloud, certain components can change at a different pace and for different reasons: for example, a gRPC API that talks to your mobile app versus a web API exposed to your customers.
I believe that components that move at a different pace and change for different reasons should not be in the same repo. It is difficult to set up CI/CD for different ways of deploying, especially if the components are not changing at the same time.
Now, regarding security, it is important to keep different components of a product in different repos; this gives you the flexibility to manage a more restricted set of credentials and to reduce the number of people who have access to each one.
In the end it comes down to three things: 1) sources of change, 2) CI/CD processes, and 3) security. You can definitely mix and match.
It also depends on your deploy patterns. A microservice architecture embedded in a monorepo gives you false peace of mind that a breaking API change is okay because you’re changing both ends of the contract in the same commit.
But when that commit gets deployed and you don't have atomic/transactional deploys across services, you get downtime between the moment the first service's deploy finishes and the moment the second's does.