Hacker News | jonas-w's comments

Mind sharing it? I have an AMD card, so it's not of much use to me, but I'd still be interested.


At least I'm not (second place currently); I'm using a single-threaded Rust program that ChatGPT wrote for me. I'm getting 10-15 million hashes per second.

But seletskiy is, as you can see in his nonce, which tells us that he uses an RTX 4090 and gets 18 gigahashes per second. I'm really curious how he wrote that program, as I have an RX 7900 XTX and would love to use it for this competition haha


Hey. It's nothing very fancy. About 150 lines of C/CUDA code with no deps, including args parsing and logging.

The code runs at a steady rate of 18.00-18.40 GH/s on a cloud GPU. Strictly speaking, that's not hashes per second but messages checked per second.

It launches 64⁶ kernels in a loop, where each launch checks the first two bytes of the SHA of a message concatenated with a unique 6-byte nonce per kernel + a 6-byte incremental nonce for each launch. There is only one block, so the SHA algorithm is heavily trimmed. Also, most of the message is hard-coded, so a pre-calculated SHA state is used; that's 10 loops fewer than needed to encode the whole block. Since we only need to check 2 bytes of the hash, the last two loops are also unrolled by hand to exclude the parts we wouldn't need. All the code is big-endian since SHA is, so the message is hard-coded in big-endian as well.

Base64-encoding is pretty time-consuming, so I've optimized it a bit by reordering the alphabet to be in ASCII-ascending order. I've got to the point where a single binary-op optimization can earn a 100-500 MH/s speed-up, and I don't really know what else is left to optimize.

I don't have an RTX 4090, so instead I just rented a 4090 GPU to run the code on. It's less than $0.30 per hour.

I've tried GPT-4 to get some directions for optimization, but every single proposal was useless or straight wrong.

I am by no means a C/GPU programmer, so it can probably be optimized much more by someone who is more knowledgeable about CUDA.

GPUs are ridiculously fast. It freaks me out that I can compute >18,000,000,000 non-trivial function calls per second.

Anyways, if you want to chat, my e-mail is in the profile.


I think we've done something similar in our kernels, because I've likewise struggled to squeeze more than ~18GH/sec from a rented 4090. I think the design of SHA256 makes it hard to avoid many of the loop iterations. It's possible there are some fun GPU specific instructions to parallelize some of the adds or memory accesses to make the kernel go faster.

If you limit your variable portion to a base16 alphabet like A-P, radix encoding would just be `nibble + 0x41`. Sure you're encoding 4x as many values, but with full branch predictability. You'd have to experimentally determine if that performs any better.

You could also do something theoretically clever with bitmasking the 4 nibbles then adding a constant like 0x4141 or 0x00410041 or something (I'm too tired to think this through) and then you can encode twice as many per op assuming 32-bit bitwise ops are similar in speed to 16-bit bitwise ops.
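To make the bitmasking idea concrete, here is a minimal Python sketch (not code from either kernel). It spreads the four nibbles of a 16-bit value into the four bytes of a 32-bit word, then adds `0x41414141` so every byte lands in `A`..`P`. The function name `encode_nibbles_a_to_p` is made up for illustration, and whether this beats a per-nibble add on a GPU would still have to be measured.

```python
def encode_nibbles_a_to_p(value16: int) -> bytes:
    """Encode a 16-bit value as four characters from the alphabet A..P.

    Hypothetical helper: spreads each nibble into its own byte with two
    shift/mask steps, then adds 0x41 ('A') to all four bytes at once.
    """
    x = value16 & 0xFFFF
    x = (x | (x << 8)) & 0x00FF00FF   # 0x0000ABCD -> 0x00AB00CD
    x = (x | (x << 4)) & 0x0F0F0F0F   # 0x00AB00CD -> 0x0A0B0C0D
    x += 0x41414141                   # no byte exceeds 0x50, so no carry between bytes
    return x.to_bytes(4, "big")


def encode_nibbles_naive(value16: int) -> bytes:
    """Reference version: one `nibble + 0x41` per output character."""
    return bytes(0x41 + ((value16 >> shift) & 0xF) for shift in (12, 8, 4, 0))
```

Both functions produce the same bytes for every 16-bit input; the first one just does it with a fixed handful of 32-bit operations and no per-character loop.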

Anyways this has been a cool challenge. You might also enjoy hunting for amulets - short poems whose sha256 hash contains a substring that's all 8s: https://news.ycombinator.com/item?id=26960729


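SHA256 is designed such that the maximum amount of data that fits within a single block is 440 bits (55 bytes): of the 64 bytes in a block, the padding takes at least one byte for the 0x80 marker and eight bytes for the message length.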
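The 55-byte limit follows directly from the padding rule, which a few lines of Python can check. This is only an illustration of the arithmetic; `sha256_block_count` is a hypothetical helper name, not part of any library.

```python
def sha256_block_count(message_len_bytes: int) -> int:
    """Number of 64-byte blocks SHA-256 processes for a message of this length.

    Padding appends one 0x80 byte, then zero bytes, then an 8-byte length
    field, so the padded size is the next multiple of 64 that fits len + 9.
    """
    padded_minimum = message_len_bytes + 1 + 8
    return (padded_minimum + 63) // 64
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second one and doubles the work per hash.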

If you carefully organize the nonce at the end and use all 55 bytes, you can pre-hash the first ~20/64 rounds of state and the first several rounds of W generation and just base further iterations off of that static value (this is known as a "midstate optimization.")
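A rough way to play with the idea on a CPU is `hashlib`'s `copy()`, which clones the hash state after the fixed prefix has been absorbed. This is only a coarse cousin of the trick described above: `hashlib` does not expose individual rounds, so the sketch reuses state at the API level rather than skipping the first ~20 rounds inside one block. The names `search` and `leading_zero_nibbles` are made up for the example.

```python
import hashlib
from typing import Iterable, Tuple


def leading_zero_nibbles(digest: bytes) -> int:
    """Count leading zero hex digits in a digest."""
    hexed = digest.hex()
    return len(hexed) - len(hexed.lstrip("0"))


def search(prefix: bytes, nonces: Iterable[bytes]) -> Tuple[int, bytes]:
    """Return (best score, nonce) over the given nonces.

    The fixed prefix is fed to SHA-256 once; each candidate only pays
    for a state copy plus hashing its own nonce bytes.
    """
    base = hashlib.sha256(prefix)      # fixed part, absorbed a single time
    best = (-1, b"")
    for nonce in nonces:
        h = base.copy()                # clone the saved state
        h.update(nonce)                # only the variable tail is hashed here
        score = leading_zero_nibbles(h.digest())
        if score > best[0]:
            best = (score, nonce)
    return best
```

Keeping the nonce at the very end of the message is what makes this work: anything that changes earlier in the input invalidates the saved state.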

> If you limit your variable portion to a base16 alphabet like A-P

The more nonce bits you decide to use, the less you can statically pre-hash.

In FPGA, I am using 64-deep, 8-bit-wide memories to do the alphabet expansion. I am guessing in CUDA you could do something similar with `LOP3.LUT`.


Thanks, I sent you an e-mail. You might need to check your spam folder, as Gmail doesn't like my mail server.


I've also been spending today building a miner. Started in JS, then JS with worker threads (~2.2MH/s), then C++ with OpenSSL (~3.7MH/s) and now attempting CUDA.

Also have been using ChatGPT to do the code translations.

I'm currently stuck yak shaving: I can't compile CUDA programs here yet because on Fedora 40 I need an earlier gcc (13.2), which I now also have to compile from source.


I'm also yak shaving currently, as I'm using NixOS... which means spending more time looking for the correct packages and creating environments.


FWIW I got it working with Rust and OpenCL; most of it is written by ChatGPT as I have no clue about OpenCL. GPU usage is only 50-60% and I get 100MH/s.

With hashcat and OpenCL I could get 12GH/s, but I couldn't find a way to use hashcat for this use case.


I've parked CUDA for now, I don't understand it enough to care to fix the tooling.

Got an optimised C++ version with no deps averaging about ~24 MH/s on an i7-11800H.

I've got 9 zeros; if I get a result that ranks top 10 I think I'll submit and call it a day.


Wow 24MH/s on an i7 with 8 cores sounds really good!

I don't know how I got it working, but I'm now at 3GH/s with my OpenCL implementation. I basically converted 90% of my Rust logic to OpenCL and now my GPU is at 100% usage. I also needed to switch to a tty, as my window manager became unresponsive haha

I'm kind of glad about this HN post, as I had absolutely no clue about how SHA256 and OpenCL worked before this challenge.

Thanks @quirino


I'm glad you had some fun! This experiment went about as well as I could hope!

If anyone's curious, I'm getting 4.5MH/s single-threaded and 12.2MH/s multi-threaded on a slightly old i7 with 4 cores.

It's my own C++ implementation, which I've made about 20% faster than the fastest one I found online (Zig/stdlib, also tried Go/stdlib, C++/cgminer, rust/ring, C++/VanitySearch and Python/stdlib).

I think it might be faster just because I was able to skip some steps, since all inputs are short and of the same length.

I've just finished testing 10^12 inputs. I think I'll stop with 10 zeroes, which is very likely to happen in the next couple of days, according to my calculations. I might revisit it later to learn some GPU programming.
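For anyone who wants to redo that estimate, here is a small sketch of the arithmetic. It assumes "zeroes" means leading zero hex digits of the hash (so each extra zero makes a hit 16 times rarer) and that every hash is an independent trial; `expected_hours` is a made-up name.

```python
def expected_hours(zero_hex_digits: int, hashes_per_second: float) -> float:
    """Average time to the first hash with this many leading zero hex digits.

    Assumption: each hash independently succeeds with probability
    16 ** -zero_hex_digits, so the mean number of tries is 16 ** zero_hex_digits.
    """
    expected_attempts = 16 ** zero_hex_digits
    return expected_attempts / hashes_per_second / 3600.0
```

At 12.2 MH/s, `expected_hours(10, 12.2e6)` comes out to roughly 25 hours. That is an average, not a deadline: the search is memoryless, so the hit can land much earlier or later.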


Very cool!

But same, I was also able to skip some steps, as I can make some assumptions about the input since it's fixed in length and doesn't need padding.

I got lucky and found a hash with 12 zeroes in 60s with my OpenCL implementation, and now I got my second spot back.

GPUs are so crazy
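To put a number on "lucky", here is a quick sketch under the same assumptions as before: "zeroes" are leading zero hex digits, each hash is an independent trial, and the rate is the ~3 GH/s mentioned earlier in the thread. `chance_within` is a hypothetical name.

```python
import math


def chance_within(zero_hex_digits: int, hashes_per_second: float, seconds: float) -> float:
    """Probability of at least one qualifying hash within the given time.

    Uses log1p/expm1 so the tiny per-hash probability is not lost to
    floating-point rounding.
    """
    p_single = 16.0 ** -zero_hex_digits
    attempts = hashes_per_second * seconds
    return -math.expm1(attempts * math.log1p(-p_single))
```

`chance_within(12, 3e9, 60)` is a bit over 0.0006, so about 1 in 1,600: a 12-zero hash in the first minute at that rate really is a fluke.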


I asked this question also some time ago: https://news.ycombinator.com/item?id=36650593


I don't know if you are being sarcastic, but the DNT header exists, just no one respects it. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/DN...



One of the former main devs posted this on discord:

"Sooo here is a timeline: 1. WTT (the other core dev) left 2. I decided to also leave (see message in #announcements) 3. I gave the discord server, subreddit and github to QuantumChilliPepper 4. QuantumChilliPepper didnt really do anything, then went on a vacation, came back and went completely m.i.a. for over a month 5. Hackernews talks about eupnea 6. QuantumChilliPepper got compromised (at least eupnea email, github and discord) 7. Hacker deleted a lot of important channels and after messing a bit finally have back the server"


yeah i watched this all go down - it was a nightmare lol


https://api.github.com/users/est

They created their account at "2008-09-07T22:46:14Z"
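If you want to check this for other accounts, a small sketch: the GitHub user endpoint returns JSON with a `created_at` field in the format shown above. The helper names `account_created` and `fetch_user_json` are made up, and the fetch is unauthenticated, so it is subject to GitHub's rate limits.

```python
import json
import urllib.request
from datetime import datetime, timezone


def account_created(user_json: str) -> datetime:
    """Parse the `created_at` timestamp out of a GitHub user API response."""
    payload = json.loads(user_json)
    parsed = datetime.strptime(payload["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)   # the trailing Z means UTC


def fetch_user_json(login: str) -> str:
    """Fetch the raw JSON for one user (needs network access)."""
    url = f"https://api.github.com/users/{login}"
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

Usage would be `account_created(fetch_user_json("est"))`.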


On desktop I'll stick to the original page, it's so simple and good.

But it isn't that good on mobile. There I use Harmonic for Hacker News; it's the best one I've used. I tried many others, but Harmonic is the only one I'm still using. The only thing that made me want to switch to another app was that it wasn't open source, but it got open sourced some time ago.

https://github.com/SimonHalvdansson/Harmonic-HN


i've been using Glider which is also open source. prior to that, i was using Materialistic. both available (along with a few others) on F-Droid.


You can also access x.com when using the Unicode Character https://xn--971h.com (𝕏.com)


Are you sure? When I type https://xn--971h.com in it doesn't work, but clicking the link in your comment does because HN has made the href x.com. The browser converts 𝕏.com to x.com before going there too. Moreover, X and 𝕏 are treated as equivalent characters in this context, so whois 𝕏.com looks up x.com, and 𝕏.com's punycode encoding is x.com: https://www.whatsmydns.net/idn-punycode-converter?q=%F0%9D%9...

whois xn--971h.com returns the same thing as an unregistered domain.
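The two behaviours can be reproduced with the Python standard library. This is a rough approximation of what browsers do (they follow the IDNA mapping rules, which this sketch stands in for with NFKC normalization plus case folding); the helper names are invented for the example.

```python
import unicodedata


def browser_like_host(host: str) -> str:
    """Approximate IDNA mapping: compatibility-normalize, then case-fold."""
    return unicodedata.normalize("NFKC", host).casefold()


def raw_punycode_label(label: str) -> str:
    """Punycode-encode a single label with no normalization applied first."""
    return "xn--" + label.encode("punycode").decode("ascii")


double_struck_x = "\U0001D54F"   # 𝕏, MATHEMATICAL DOUBLE-STRUCK CAPITAL X
cyrillic_kha = "\u0445"          # х, CYRILLIC SMALL LETTER HA

print(browser_like_host(double_struck_x + ".com"))   # x.com
print(raw_punycode_label(double_struck_x))           # xn--971h
print(raw_punycode_label(cyrillic_kha))              # xn--u1a
```

So `xn--971h` is what you get by encoding 𝕏 without mapping it first, while anything that applies the mapping collapses 𝕏 to a plain ASCII `x` and never needs Punycode at all. The Cyrillic х has no such compatibility mapping, which is why `xn--u1a.com` can exist as a separate name.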


HN automatically converts these urls to punycode, maybe this is a bug and it automatically converts everything to punycode even though it doesn't make sense?


It's unrelated to HN. Copy pasting 𝕏.com into the address bar still leads to x.com

related: https://en.wikipedia.org/wiki/IDN_homograph_attack


I already mentioned that, and it's not what they mean. Hover your mouse over a https://xn--971h.com link in a comment on HN and notice that it's pointing to https://x.com. This does seem to be an HN thing; see here: https://jsfiddle.net/rtfhejdy/


I think so. Then I'm not right about the punycode encoding of 𝕏.com - it's that it isn't even needed. I've emailed dang about it.


Looks like it's also not possible to register xn--971h.com (I tried both Namecheap and Cloudflare).


Yeah, it's explicitly disallowed by ICANN to register a domain with this unicode character (along with numerous other characters):

https://www.verisign.com/assets/icannrestricted/idn-icann-re...


Also https://xn--u1a.com is for sale (where х is a Cyrillic letter, see [0]).

Would be funny to buy it and make a redirect to threads.net.

[0] https://en.wikipedia.org/wiki/Kha_(Cyrillic)



i thought all non-latin punycode domains were forbidden for purchase on .com?

edit - i looked up the whois, just because it's parked at godaddy doesn't mean it's for sale

    Name: XN--U1A.COM
    Internationalized Domain Name: х.com
    Registry Domain ID: 106236037_DOMAIN_COM-VRSN
    Domain Status:
      clientDeleteProhibited
      clientRenewProhibited
      clientTransferProhibited
      clientUpdateProhibited


    $ curl -i 'https://xn--971h.com'
    curl: (6) Could not resolve host: xn--971h.com


    $ curl 𝕏.com
    <html>
    <head><title>400 Bad Request</title></head>
    <body>
    <center><h1>400 Bad Request</h1></center>
    <hr><center>cloudflare</center>
    </body>
    </html>


What's that all about then? I'd expect curl to deal with either/or

curl should especially live with a punycode version.

Unless the punycode from the OP is incorrect? From basic online tools, it looks like it simply converts to x.com


I get:

  $ curl http://𝕏.com -v
  *   Trying 34.102.136.180:80...
  * Connected to 𝕏.com (34.102.136.180) port 80 (#0)
  > GET / HTTP/1.1
  > Host: x.com
  > User-Agent: curl/8.1.2
  > Accept: */*
  >
  < HTTP/1.1 200 OK

No https, though. (OpenSSL SSL_connect: Connection reset by peer in connection to x.com:443)


I wonder what Elon paid for x.com?


I'm not sure. There was a short period when you could register one-letter dotcoms, before about 1994. He's owned it since forever that I know of. He co-founded a company called X.com in 1999, which became PayPal.

I spent a long time in the mid-90s trying to persuade INTERNIC to let me register b.com.


Shame. That would have made a great Chinese porn site. https://en.wiktionary.org/wiki/%e5%b1%84


I'm pretty certain you can register one-letter IDN .coms. I registered a bunch the day the IDN system went live, and I had no idea what characters I was registering or what they meant and I later let them all lapse.


See about a dozen comments in this thread, PayPal owned it post-acquisition until 2017 when Musk bought it back.


As I mentioned in another comment [0], a domain which looks like X.com but uses a Unicode X, https://xn--vm8a.com/, can be bought for 9,000,000 USD. I'd think that the real X.com is worth much more, as it doesn't get converted to Punycode [1].

[0]: https://news.ycombinator.com/item?id=36844079 [1]: https://wikipedia.org/wiki/Punycode

