
A simple cue like asking the model to 'see' or 'hear' can push a purely text-trained language model towards the representations of purely image-trained or purely-audio trained encoders.


Has anyone found these deep research tools useful? In my experience, they generate really bland reports that don't go much further than summarizing what a search engine would return.


The reports are definitely bland, but I find them very helpful for discovering sources. For example, if I'm trying to ask an academic question like "has X been done before," sending something to scour the internet and find me examples to dig into is really helpful - especially since LLMs have some base knowledge which can help with finding the right search terms. It's not doing all the thinking, but those kinds of broad overviews are quite helpful, especially since they can just run in the background.


I've noticed that most of my LLM usage goes like this:

ask a loaded "filter question" I more or less know the answer to, then mostly skip the prose and go straight to the links to its sources.


The "loaded question" approach works for getting MUCH better pro/con lists, too, in general, across all LLMs.


I do that too, I wonder how much of it is the LLM being helpful and how much of it is the RAG algorithm somehow providing better references to the LLM than a google search can?


My experience is the same as yours. It feels to me (similar to most LLM writing) like they write for someone who’s not going to read it or use it but is going to glance at it and judge the quality that way and assume it’s good.

Not too different from a lot of consulting reports, in fact, and pretty much of no value if you're actually trying to learn something.

Edit to add: even the name “deep research” to me feels like something defined to appeal to people who have never actually done or consumed research, sort of like the whole “phd level” thing.


"they write for someone who’s not going to read it" is a great way to phrase it.


I run a small website and am based in the UK and have used it a couple of times to summarise what I need to do to comply with different bits of legislation e.g. Online Safety Act. What's really useful for me is that I can feed in a load of context about what the site does and get a response that's very tailored to what's relevant for me, and generate template paperwork that I can then fill out to improve my position with regard to the legislation.

For sure it's probably missing stuff that a well-paid lawyer would catch, but for a project with zero budget it's a massive step up over spending hours reading through search results and trying to cobble something together myself.


The hidden cost there is that the risk of complying with the legislation remained entirely with you. Even the best specialist research LLM still might easily have hallucinated or made some other sort of error which resulted in it giving you confusing or incorrect advice - and you would have been the one held liable for following it.

Whereas with real legal advice, your lawyer will carry Professional Indemnity Insurance which will cover any costs incurred if they make a mistake when advising you.

As you say, it's a reasonable trade-off for you to have made when the alternative was sifting through the legislation in your own spare time. But it's not actually worth very much, and you might just as well have used a general model to carry out the same task and the outcome would likely have been much the same.

So it's not particularly clear that the benefits of these niche-specific models or specialised fine-tunes are worth the additional costs.

(Caveat: things might change in the future, especially if advancements in the general models really are beginning to plateau.)


"Summarization of what a search engine would return" is good enough for many of my purposes though. Good for breaking into new grounds, finding unknown unknowns, brainstorming etc.


I have a script that searches DDG (free), scrapes top 5 results, shoves them into an LLM, and answers your question.

I wrote it back when AI web search was a paid feature and I wanted access to it.

At the time Auto-GPT was popular and using the LLM itself to slowly and unreliably do the research.

So I realized a Python program would be way faster and it would actually be deterministic in terms of doing what you expect.

This experience sort of shaped my attitude about agentic stuff, where it looks like we are still relying too heavily on the LLM and neglecting to mechanize things that could just work perfectly every time.
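The script described above can be sketched roughly like this. This is a hypothetical reconstruction, not the author's actual code: the `html.duckduckgo.com` endpoint and its markup are assumptions that may change, and the scraped text would still need to be handed to whatever LLM API you use.

```python
import re
import urllib.parse
import urllib.request

def ddg_search(query, n=5):
    """Fetch DuckDuckGo's HTML results page and pull out the top-n result URLs.
    (Assumes the html.duckduckgo.com endpoint; its markup may change.)"""
    url = "https://html.duckduckgo.com/html/?q=" + urllib.parse.quote(query)
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    html = urllib.request.urlopen(req).read().decode("utf-8", "ignore")
    return re.findall(r'href="(https?://[^"]+)"[^>]*class="result__url"', html)[:n]

def build_prompt(question, page_texts, max_chars=2000):
    """Pure function: stitch scraped page texts into a single LLM prompt,
    truncating each source so the prompt stays within context limits."""
    context = "\n\n".join(
        f"[{i + 1}] {text[:max_chars]}" for i, text in enumerate(page_texts)
    )
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"
```

The deterministic part (search, scrape, prompt assembly) is plain Python; only the final answer step involves the model.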


If you think these things are just using a "dumb" search query, and using the top 5 hits, you're in for a lot of surprises very soon.


Well, considering TFA, it would be pretty strange if I did!

My point was it's silly to rely on a slow, expensive, unreliable system to do things you can do quickly and reliably with ten lines of Python.

I saw this in the Auto-GPT days. They tried to make GPT-4 (the non-agentic one with the 8k context window) use tool calls to do a bunch of tasks. And it kept getting confused and forgetting to do stuff.

Whereas if you just had

for page in pages: summarize(page)

it works 100% of the time, can be parallelized etc.

And of course the best part is that the LLM itself can write that code, i.e. it already has the power to make up for its own weaknesses, and make (parts of itself) run deterministically.
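That loop, parallelized, is only a few lines; `summarize` here is a stand-in for whatever single LLM call you'd make per page, not a real API:

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(page):
    # Stand-in for one LLM call per page. The scaffolding around it is
    # deterministic even if the model's output isn't.
    return f"summary of {page}"

def summarize_all(pages, workers=5):
    # Every page gets summarized exactly once, in order - no agent loop
    # deciding whether or when to bother.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, pages))
```

Because `pool.map` preserves input order, the results line up with the pages regardless of which worker finishes first.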

---

On that note, do you know more about the environment they ran this thing in? I got API access (it's free on OpenRouter), but I'm not sure what to plug this into. OpenRouter provides a search tool, but the paper mentions intelligent context compression and all sorts of things.


I tend to use them when I'm looking to buy something of category X, and want to get a market overview. I can then still dig in and decide whether I consider the sources used trustworthy or not, and before committing money, I'll read some reviews myself, too. Still, it's a speedup for me.


Yes, this is one of my primary use cases for deep research right now. It will become garbage in a few short years once OpenAI starts selling influence / ads. I think they’ve started a bit with doing this but the recommendations still seem mostly “correct”. My prior way of doing this was Googling with site:Reddit.com for real reviews and not SEO spam reviewers.


Same case for me. I find it pretty good at it too. Far from perfect, but it's usually a pretty darn good start.


I have used Gemini's 2.5 Pro deep research probably about 10 times. I love it. Most recently was reviewing PhD programs in my area then deep diving into faculty research areas.


I use ChatGPTs quite often. I can send it a loaded question and it helps tease out sources and usually at the very least scrapes away some of the nuance. I have used it a lot for finding a list of a type of products too. Taking the top n search results is already pretty useful for me but I find it typically is a little more in depth than that, going down a few rabbit holes of search depending on the topic. It does not eliminate doing your own research but it helps consolidate some of the initial information.

Then I can further interrogate the information returned with a vanilla LLM.


Perplexity’s Research tool has basically replaced Google for me, for any search where I don’t already know the answer or know that it’s available somewhere (like documentation).

I use it dozens of times per day, and typically follow up or ask refining questions within the thread if it’s not giving me what I need.

It typically takes between 10sec and 5 minutes, and mostly replicates my manual process - search, review results, another 1..N search passes, review, etc. Initially it rephrases/refines my query, then builds a plan, and this looks a lot like what I might do manually.


I almost exclusively use Deep Research as inputs to LLMs for deeper domain knowledge including frontier scientific theories etc.


You can copy-paste it into your favorite LLM and ask questions about it. That solves several problems simultaneously.


I haven't used any LLM deep research tools in the past; today, after reading this HN post, I gave Tongyi DeepResearch a try to see how it performs on a simple "research" task (in an area I have working experience in: healthcare and EHR) and I'm satisfied with its response (for the given task; I obviously can't say how it'll perform on other "research" tasks I ask it in the future). I think I'll keep using this model for tasks for which I was using other local LLM models before.

Besides I might give other large deep research models a try when needed.


A 23x revenue multiple is not an expensive valuation for something growing well over 100% YoY, even by public market standards, let alone venture capital valuations.


Yeah, but is that growth really durable ARR, or just 1-2 years of revenue, considering AI capability growth across players, open-weight/open-source models, and inevitable price wars?


Do you really think there's competition?

I don't see anyone else even in the top 30 on this list: https://apps.apple.com/us/charts/iphone/top-free-apps/36


What's the goal here? What do you plan on doing with this?


We discuss it at the end of this blog (https://eyes.mit.edu/what-if/), but the main future goal would be to discover new types of eyes or vision systems. The results we show on the website generally agree with the conclusions from biologists, but what if we put the animal in an environment not actually feasible in the real world? Like what if we put an animal on a mars-like world, what kind of eyes would it evolve? And if we constrain that animal to only evolve eyes that we can manufacture (like a camera), would it discover a new type of camera or algorithm that's better than what humans have engineered? So here we show an attempt to recreate biological vision, but we're interested in applying it to artificial vision (i.e., computer vision, robotics, etc.).


Maybe I just can't see the big picture (pun++) but why limit the simulations to the eye? Many other senses are simultaneously working for survival. I'm not aware of anything that just uses the eye for survival. https://en.wikipedia.org/wiki/Sense


That's very true and is also an area of interest. The number of possible animals that can be evolved increases substantially with more parameters, so adding more sensors/types of sensors makes the problem even more difficult. Also our animals are simple balls, where most intelligent animals have arms, legs, so on. So modeling more would definitely be interesting, too!


This is news to me. Any good examples of this outside of the above?


Vision language models are blind (192 comments) https://news.ycombinator.com/item?id=40926734


On the other hand, if I take pictures of circuits, boards, electronic components, etc., GPT-4o is pretty reliably able to explain to me the pinouts, the board layouts, and reference material in the data sheets, and provide pretty reasonable advice about how to use them (i.e., where to put resistors and why, what pins to use for the component on an ESP32, etc.). As a beginner in electronics this is fabulously helpful. Its ability to pass vision tests seems like a pretty dumb utility metric when most people judge utility by how useful things are.


And Elon is a guy that knows how to tend to flocks of sheep(le)


...or maybe the education of what we deem as useful work needs to change.

Imagine if digital cameras were watermarking their photos because art classes refused to consider photography as a form of art.


> catastrophically losing money on every call

You're going to have to bring some actual numbers on that speculation.


How do you know it isn't RAG?


Housing prices remaining high because people can't move, and people being unable to buy because of interest rates, just shows how stupidly hard it is to build new housing in the US.


Home builders operate on borrowed money so high interest rates push up their costs significantly. Worse still, with rates expected to slowly come down they’ll be selling into a market where house prices are likely lower (in real terms). Neither are really connected to how hard it is to build housing (though yes it’s too hard).


Does anyone have a good explanation of why the Federal Reserve isn’t able to have lower interest rates for supply creation activities? It appears that would be a useful tool that would work in favor of their inflation and employment mandate.


Money is fungible, and people WILL arbitrage equal-risk rate differentials, even if you make it illegal.


The economy is doing "good," so there is no need. And there is still a large risk of inflation coming back, so it is better to keep rates higher. There is no real need yet to lower rates - better to wait and see.


The idea is that it could help lower inflation as shelter is a large portion of core CPI. It’s not about lowering the prime rate, but having a lower rate for creation of supply, which would exclude all services (80% of the economy).


The financing for home builders would also be a bit cheaper if housing were easier, and therefore less risky, to build.

But agree that higher interest rates hurt builders too.


I was having this discussion with my father the other day.

The housing crisis seems to be essentially global. It's getting really very bad over here in Australia, we have young families being priced out of even the rental markets and staying in tents and stuff.

I don't understand how the building industry can possibly be going broke when people are clamoring for houses to be built!


It’s the “nobody builds used cars” problem.

The people who have the money want high end big houses to be built, so that’s the supply being added.

It's not being added fast enough to free up enough "used" smaller houses.

