If these values really meant anything, then Anthropic should stop working with Palantir entirely, given Palantir's work with ICE, domestic surveillance, and other objectionable activities.
PyTorch was partly inspired by the Python Autograd library (circa 2015 [1]), to the point where they called their autodiff [2] system "autograd" [3]. JAX is the direct successor of the Autograd library, and several of the Autograd developers work on JAX to this day. For that matter, PyTorch author Adam Paszke is now on the JAX team and seems to work on JAX and Dex these days.
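The shared lineage shows up in the API: Autograd, PyTorch's autograd, and JAX all center on reverse-mode automatic differentiation. A minimal pure-Python sketch of the idea (toy scalar code for illustration only — the real libraries trace whole array programs and are far more sophisticated):

```python
# Toy reverse-mode autodiff: record local derivatives on the forward
# pass, then propagate gradients backward through the recorded graph.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # (parent_var, local_derivative) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # Accumulate the incoming gradient, then push it to parents,
        # scaled by each recorded local derivative (chain rule).
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

The user-facing pattern — build the computation normally, then call one function to get gradients — is what Autograd pioneered for NumPy code and what PyTorch's `.backward()` and JAX's `grad` both descend from.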
Direct indexing is WAY more valuable to US citizens living in the EU than US citizens living in the USA because of the painful intersection of MiFID II rules and US tax law (PFIC tax cancer makes buying EU domiciled funds a non-starter). Brokers will not sell US domiciled ETFs to US citizens living in the EU unless they can opt out of the consumer disclosure rules (e.g. by becoming an elective professional client of their broker under MiFID II rules). So these 2 million US expats have no choice but to manage a portfolio of individual stocks or pay exorbitant AUM fees. The first half-way decent direct indexing product that accepts US expats residing in the EU will make a killing!
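For context, "direct indexing" just means replicating an index by holding the individual stocks yourself, sized by index weight — exactly the portfolio-of-individual-stocks chore these expats are otherwise stuck doing by hand. A toy sketch of the core sizing step (tickers, weights, and prices are all made up):

```python
# Illustrative direct-indexing allocation: turn index weights into
# whole-share orders for a given portfolio value. Made-up data.
index_weights = {"AAA": 0.40, "BBB": 0.35, "CCC": 0.25}
prices = {"AAA": 150.0, "BBB": 70.0, "CCC": 40.0}
portfolio_value = 100_000.0

orders = {}
for ticker, weight in index_weights.items():
    target_dollars = portfolio_value * weight
    # Whole shares only; the rounding remainder is part of the tracking
    # error you accept for not being able to buy a fund.
    orders[ticker] = int(target_dollars // prices[ticker])

print(orders)  # {'AAA': 266, 'BBB': 500, 'CCC': 625}
```

A real product layers tax-lot tracking, rebalancing, and tax-loss harvesting on top of this, which is where the value over a hand-managed portfolio comes from.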
Igor Markov, along with Sat Chatterjee, seems to be pursuing a bizarre vendetta against the lead authors of the chip placement work (after Sat failed to take over their project), not an intellectually honest critique.
This was covered previously in the press and on social media, with statements from a variety of prominent researchers (e.g. [1][2][3]).
Why not read the paper Igor wrote and try to find a single instance where he launches a personal attack on the researchers, calls them names, etc.?
Just note the difference in tone between Igor's work and the stuff of his detractors. One side immediately goes personal, tries to figure out the motivations of the opposing counsel, talks about harassment, and sounds emotional, to say the least. The other has an extremely objective tone, focuses only on the subject matter, and in general reads more like a maths theorem than an activist essay.
>> Google stands by this work published in Nature on ML for Chip Design, which has been independently replicated, open-sourced, and used in production at Google.
>> The results in the Nature paper were independently replicated and validated by my team, the results were used in actual chips and Sat and his collaborators know it.
>> Furthermore, the code was open-sourced.
>> It is sad that you are providing a platform for someone's resentments.
The claims about independent replication refer to Google's circuit_training repository[1]. The UCSD team has conclusively shown this claim was materially false (see section 3 of their paper[2]).
BTW, Prof. Andrew Kahng, who headed the UCSD effort, initially wrote an extremely favorable editorial about the Nature paper[3].
The matter is way past superficial personal accusations. And the people at these Twitter links have no technical background in chip design (why would anyone listen to them?). Sergio's and Zoubin's tweets are obviously and verifiably wrong.
"The AlexNet paper is generally recognized as the paper that sparked the field of Deep Learning"
Uh not really. That would have been "A fast learning algorithm for deep belief nets" in 2006.
Also weird how this list completely ignores speech recognition. Deep learning's success in speech recognition predates AlexNet and motivated Google to create TPUs [1], and, more generally, invest in deep learning.
One paper will be the most cited. One paper will be the earliest.
Unless they are the same there is not _one_ answer, what you get is an efficient set (the earliest paper with at least X citations / the most cited paper as of D).
And of course there is the problem of making the list of relevant papers in the first place.
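The efficient set described above is just a Pareto frontier over (year, citations): a paper belongs on it if no other paper is both at least as early and at least as cited. A quick sketch with invented data:

```python
# Pareto frontier over (earlier year, more citations).
# Paper names, years, and citation counts are invented for illustration.
papers = [
    ("A", 2006, 30_000),
    ("B", 2012, 120_000),
    ("C", 2014, 80_000),   # later than B with fewer citations: dominated
    ("D", 1998, 5_000),
]

def pareto_front(papers):
    front = []
    for name, year, cites in papers:
        dominated = any(
            (y <= year and c >= cites) and (y, c) != (year, cites)
            for _, y, c in papers
        )
        if not dominated:
            front.append(name)
    return front

print(pareto_front(papers))  # ['A', 'B', 'D']
```

Note that "most cited" and "earliest" are just the two endpoints of this frontier; everything in between is also a defensible answer to "the paper that started the field".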
You are mistaken. Ladley's papers detailing the explorative method and his figurative routines, adopted by "A fast learning algorithm for deep belief nets", were the firecracker that started deep learning.
Also, anyone who references wired.com should be shown the door.
> anyone who references [Source] should be shown the door
Well (a predictable reply), we are past the age of editorial boards (except for The Economist, probably) and in the age of independent journalists lending their efforts inconsistently to different publishers.
Honest question: which paper is this? The 2012 one? The 2006 Hinton paper was surely the trigger - that's what I used, and it was filled with things that worked very quickly on GPUs.
Going from 25% top-1 error to 22% top-1 error is a massive jump on ImageNet and very meaningful in a lot of applications. That said, there is no reason to believe attention-based models are the only way forward for image classification. The humble ResNet-50 can get near 22% top-1 error when trained properly.
Whether we need attention or not is a more interesting question on seq2seq models on text data.
Julia provides no meaningful advantages. PyTorch and JAX are too good. For typical deep learning workloads, Julia will not easily have a speed advantage. Everything goes down to the same cuDNN kernels anyway.
Julia seems like an attempt at a better MATLAB, but the machine learning world moved to Python first.
(Also 1-based indexing is almost as obnoxious as the 24/7 Julia shill brigade.)
> (Also 1-based indexing is almost as obnoxious as the 24/7 Julia shill brigade.)
You had me at 1-based indexing. This is sadly an anti-feature of Julia as far as I'm concerned.
I know that 1-based indexing is used in Fortran, MATLAB, and Lua (and other places) -- but I just find 0-based indexing more natural. If I had grown up using 1-based indexing I'd probably be saying the opposite, but this is what I prefer now.
Julia is probably a cleaner system overall. But then, so is Deno compared to Node. The problem is that the alternative needs to be many times better, not merely better. The network effect in Python scientific computing is quite strong, and it still outweighs some of the real benefits of Julia.
It is always good to have alternatives. The quality of GCC improved when Clang came along. Julia's presence will keep Python on its toes. Both platforms will keep improving.
But, once again, Julia's 1-based indexing is something that adds unnecessary friction, for me at least.
Julia has this. It's called OffsetArrays.jl. Ironically, it's a common source of bugs, because library authors don't always anticipate offset indices and loop over 1:length(A) rather than eachindex(A).
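The same bug pattern exists outside Julia: any code that hardcodes the index range instead of asking the container for it. A Python analogue of the 1:length(A) vs eachindex(A) trap (OffsetList is a made-up toy class, standing in for OffsetArrays.jl):

```python
# A container whose valid indices start at an arbitrary offset.
class OffsetList:
    def __init__(self, data, first_index):
        self._data = list(data)
        self.first_index = first_index

    def __len__(self):
        return len(self._data)

    def indices(self):
        # Analogue of Julia's eachindex(A): the container's own axis.
        return range(self.first_index, self.first_index + len(self._data))

    def __getitem__(self, i):
        if not (self.first_index <= i < self.first_index + len(self._data)):
            raise IndexError(f"index {i} out of bounds")
        return self._data[i - self.first_index]

a = OffsetList([10, 20, 30], first_index=5)

# Buggy: assumes indices run 0..len-1, like 1:length(A) in Julia.
try:
    total = sum(a[i] for i in range(len(a)))
except IndexError as e:
    print("bug:", e)

# Generic: ask the container for its index range, like eachindex(A).
total = sum(a[i] for i in a.indices())
print(total)  # 60
```

The fix is the same in both languages: iterate the container's own index set rather than assuming where it starts.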
There are plenty of free, open access journals that are reputable.
JMLR http://www.jmlr.org/ is quite successful. There are some fields, such as machine learning, that are not dominated by for-profit journals. Why is this possible in some fields and not others? My answer would be that it is possible in all fields, but incumbency advantages can be very strong and coordination across academic volunteers can be more difficult when they are individually less secure.