Hacker News
pvarangot on Oct 30, 2024 | on: Pushing the frontiers of audio generation
That's probably because it was trained on "professional audio" such as ads, movies, and audiobooks, not on "normal people talking". It's like the effect you got when diffusion models were mostly trained on stock photos.