
You are absolutely correct! I started working on a sort of compiler a while back but decided to get the basics down first. The templates and switches are not really the issue; the real cost is going back and forth between C and Python. This is an experiment I did a few months ago: https://x.com/nirw4nna/status/1904114563672354822. As you can see, there is a ~20% perf gain just from generating a naive C++ kernel instead of calling 5 separate kernels in the case of softmax.
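To make the softmax case concrete, here is a rough sketch of the two approaches, not the code from the experiment. I am assuming the 5 separate kernels are max, subtract, exp, sum and divide; every function name below is made up for illustration, and it only handles a flat 1-D float vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for five separately dispatched kernels.
// Each one is a full pass over memory, and three of them allocate a temporary.
static float reduce_max(const std::vector<float>& x) {
    return *std::max_element(x.begin(), x.end());
}

static std::vector<float> sub_scalar(const std::vector<float>& x, float s) {
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] - s;
    return out;
}

static std::vector<float> exp_all(const std::vector<float>& x) {
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = std::exp(x[i]);
    return out;
}

static float reduce_sum(const std::vector<float>& x) {
    float s = 0.0f;
    for (float v : x) s += v;
    return s;
}

static std::vector<float> div_scalar(const std::vector<float>& x, float s) {
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] / s;
    return out;
}

// Unfused softmax: five calls, each of which would cross the
// Python/C boundary in a binding-based library.
std::vector<float> softmax_unfused(const std::vector<float>& x) {
    if (x.empty()) return {};
    const float m = reduce_max(x);
    const std::vector<float> shifted = sub_scalar(x, m);
    const std::vector<float> e = exp_all(shifted);
    const float s = reduce_sum(e);
    return div_scalar(e, s);
}

// Naive fused softmax: one call, one output buffer, three loops.
std::vector<float> softmax_fused(const std::vector<float>& x) {
    std::vector<float> out(x.size());
    if (x.empty()) return out;

    float m = x[0];
    for (float v : x) m = std::max(m, v);

    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::exp(x[i] - m);  // subtract and exp in the same pass
        sum += out[i];
    }
    assert(sum > 0.0f);  // the max element contributes exp(0) = 1 for finite input

    const float inv = 1.0f / sum;
    for (float& v : out) v *= inv;
    return out;
}
```

Both functions should agree up to float rounding. The fused one is still naive (no SIMD, no blocking); the point is only that it replaces five dispatches and three temporaries with a single call.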

