
genuinely curious, don't JS's Proxy objects and prototype-based MRO have a similar performance impact in theory?

Yeah, I don't see how Python is fundamentally different from JavaScript as far as dynamism goes. Sure, Python has operator overloading, but JavaScript would just implement those as regular methods. Python's __init__ and __new__ aren't any more convoluted than JavaScript's constructors. Python may support multiple inheritance, but method and attribute resolution just walks the MRO, which is no different than JavaScript's prototype chain.

Urban myths.

Most people who parrot Python dynamism as the root cause never used Smalltalk, Self or Common Lisp, or even PyPy for that matter.


ArchiveBox open source does not, but I have set it up for paying clients in the past using TLSNotary. This is actually a very hard problem and is not as simple as saving traffic hashes + original SSL certs (because HTTPS connections use a symmetric key after the initial handshake, the archivist can forge server responses and claim the server sent things that it did not).

There is only one reasonable approach that I know of as of today: https://tlsnotary.org/docs/intro, and it still involves trusting a third party with reputation (though it cleverly uses a zk algorithm so that the third party doesn't have to see the cleartext). Anyone claiming to provide "verifiable" web archives is likely lying or overstating it unless they are using TLSNotary or a similar approach. I've seen far too many companies make impossible claims about "signed" or "verified" web archives over the last decade. Be very critical any time you see someone claiming that unless they talk explicitly about the "TLS Non-Repudiation Problem" and how they solve it: https://security.stackexchange.com/questions/103645/does-ssl...
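To make the non-repudiation point concrete, here's a toy sketch (HMAC standing in for the real TLS record layer, with an invented session key): since both endpoints hold the same symmetric keys after the handshake, a record "authenticated" by the archivist is indistinguishable from one authenticated by the server.

```python
# Toy illustration (not actual TLS) of why a recorded transcript isn't
# proof: client and server share the same symmetric session key, so the
# archivist can mint a forged "server" record that verifies identically.
import hashlib
import hmac

SESSION_KEY = b"negotiated-session-key"  # known to client AND server

def seal(payload: bytes) -> tuple[bytes, bytes]:
    """Attach a MAC, schematically like a TLS record layer would."""
    return payload, hmac.new(SESSION_KEY, payload, hashlib.sha256).digest()

def verify(payload: bytes, tag: bytes) -> bool:
    expected = hmac.new(SESSION_KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

real = seal(b"<html>what the server actually sent</html>")
forged = seal(b"<html>whatever the archivist invents</html>")
# Both records verify -- a third party can't tell which one is genuine.
```

TLSNotary sidesteps this by splitting the key material so neither the client nor the notary alone can forge a record.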


If web pages were signed the way emails can be, you could verify whether an archived copy of a web page is authentic, but good luck getting such a major change adopted across the entire web. Why would anyone who would gladly retract / redact information on a whim even subscribe to this technology? Would be nice if they all did though.


Have you seen this before? https://web.dev/articles/signed-exchanges

It's not for authenticity though, it's more about preventing tampering like ad blocking and tracker removal. If you search around, there's a history of many complaints about this.


I've been mulling over how to take ArchiveBox in this direction for years, but it's a really hard problem to tackle because of privacy. https://docs.sweeting.me/s/cookie-dilemma

Most content is going behind logins these days, and if you include the PII of the person doing the archiving in the archives then it's A. really easy for providers to block that account B. potentially dangerous to dox the person doing the archiving. The problem with removing PII from logged-in sites is that it's not as simple as stripping some EXIF data; the HTML and JS are littered with secret tokens, usernames, user-specific notifications, etc. that would reveal the identity of the archivist and can't be removed without breaking page behavior on replay.

My latest progress is that it might be possible to anonymize logged in snapshots by using the intersection of two different logged-in snapshots, making them easier to share over a distributed system like Bittorrent or IPFS without doxxing the archivist.

More here: https://github.com/pirate/html-private-set-intersection
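A toy version of the intersection idea (much simpler than the linked project, which uses private set intersection so the two archivists never see each other's raw snapshots): diff two captures of the same page made from different accounts and keep only the lines they share, so anything user-specific falls out.

```python
# Toy sketch: anything that differs between two logged-in captures of
# the same page (usernames, CSRF tokens, notification counts) is
# assumed to be user-specific and dropped; only content common to both
# accounts survives into the shareable snapshot.
import difflib

def anonymize(snapshot_a: str, snapshot_b: str) -> str:
    a, b = snapshot_a.splitlines(), snapshot_b.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    common: list[str] = []
    for block in matcher.get_matching_blocks():
        common.extend(a[block.a : block.a + block.size])
    return "\n".join(common)

page_alice = "<h1>Article</h1>\n<p>Hello alice</p>\n<p>Body text</p>"
page_bob = "<h1>Article</h1>\n<p>Hello bob</p>\n<p>Body text</p>"
clean = anonymize(page_alice, page_bob)  # greeting lines drop out
```

The real difficulty (and the point of the PSI approach) is doing this without either party revealing their un-anonymized snapshot to the other.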


btop is good, I like 'glances' the best though because like 'atop' it actually highlights whatever problem is most likely to be causing lag at the moment, and it breaks out docker containers into a separate section and labels them properly.

I have a few more listed + notes on them here: https://docs.sweeting.me/s/system-monitoring-tools#All-in-on...


my personal theory is that archive.is has paid subscription accounts (legit or via botnet) to most of the major news outlets and edits the HTML to make the sites look not logged in. I wonder if they do it by hand or by doing something like: https://github.com/pirate/html-private-set-intersection


More than a theory -- they've talked about this on their blog[1] before.

[1] https://blog.archive.today/


in my experience it's just a headless browser with a bypass-paywalls extension


It is definitely more than that for some sites, and it has to be manually managed. For example, this year I've seen archive.is capture paid articles from some Finnish newspapers, and the layout gives away that it is logged in on an account, although the identifying details have been stripped out.

There have been periods of weeks/months when they don't have paid access to those Finnish sites. Tried it just now on a hs.fi paid article from today and it didn't work, but for example paid articles from just a week ago seem to have been captured as a premium user.

It is curious how they have time to do it and I wonder if news sites of other smaller languages get similar treatment.


Create a VPN on a GCP IP address, use the Googlebot user agent, and paywalls are gone.


Probably works against a fair few sites, but not if they verify crawlers with reverse DNS (rDNS).
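For anyone unfamiliar, the rDNS check sites run looks roughly like this sketch: reverse-resolve the client IP, require a Google-owned hostname, then forward-resolve that hostname and confirm it maps back to the same IP. A spoofed user agent passes neither step.

```python
# Sketch of the reverse-DNS check used to catch fake Googlebots:
# 1. PTR lookup on the client IP must yield a googlebot.com/google.com host
# 2. forward (A) lookup on that host must resolve back to the same IP
import socket

def is_real_googlebot(ip: str) -> bool:
    try:
        host, _, _ = socket.gethostbyaddr(ip)  # reverse lookup (PTR)
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return ip in socket.gethostbyname_ex(host)[2]  # forward confirmation
    except OSError:  # no PTR record, NXDOMAIN, timeout, etc.
        return False
```

Faking the user agent is trivial; faking control of Google's PTR records is not, which is why the GCP-IP trick only fools sites that skip this check.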


Security and auditability is not the core problem, it's versioning and uninstalling. https://docs.sweeting.me/s/against-curl-sh


Also file conflicts. Installing an RPM/ALPM/APK should warn you before it clobbers existing files. But for a one-off install script, all it takes is a missing environment variable or an extra space (`mv /etc/$INSTAALCONF /tmp`, `chown -R root /$MY_DATA_PATFH`), and suddenly you can't log on.

Of course, unpredictability itself is also a security problem. I'm not even supposed to run partial upgrades, and those at least come from the same repository. I ain't gonna shovel random shell scripts into the mix and hope for the best.


Uninstalling can be a problem.

Versioning, OTOH, is often more problematic with distro package managers, which can't support multiple versions of the same package.

Also, the inability to do per-user installs is a big problem with distro package managers.


Heh, BrowserBase and Browser-Use exist specifically because this is a harder problem than it looks. Any approach will work for the first couple of actions; the hard parts are long strings of actions that depend on the results of previous actions, compressing the context and knowing what to send, and having your tools work across all the edge cases (e.g. date picker fields, file upload fields, cross-origin iframes, etc.).


ULID is the best balance imo: it's more compact, can be double-clicked to select, and is case-insensitive so it can be saved on macOS filesystems without conflicts.

Now someone should make a UUIDv7 <-> ULID adapter lib that translates 1:1 between them, preserving all the timestamp-resolution and randomness bits, so we can use db-level UUIDv7 support to store ULIDs.


A UUID is a 128-bit number with a specific structure. You can encode them in base32 if you want; there is no need for any sort of conversion scheme.


You need to convert it to preserve the timestamp info correctly, so that a ULID library reading the base32 format would reproduce the same timestamp.


What I'm saying is that ULID is irrelevant and unnecessary, if you want "double clicked to select, and case-insensitive" you just encode your UUIDs in base32. They're still UUIDs.


Nah ULID guarantees some extra stuff as part of its spec beyond UUIDv7, including true sub-millisecond monotonicity and a ceiling of 7ZZZZZZZZZZZZZZZZZZZZZZZZZ. You can convert to base32 to get some of the benefits but they're not exactly the same.


I must have a very different brain shape than the author. Color processing is for me subconscious; I don't get the "color overload" situation at all because my brain hardware-accelerated it long ago, and there is no conscious load to track additional colors or pick out differences. The only time I experience that is when looking at someone else's color scheme while pairing.

It lost me after this part:

> Here’s a quick test. Try to find the function definition here:

I found them instantly with more color, and struggled with less, and found the same for all the subsequent examples as well.


> Color processing is for me subconscious, I don't get the "color overload"

Same. I noticed this in a WhatsApp group I have with old friends. There are about 8 people, so each gets a specific color for their name, and that's how I identify who is talking at a glance.

Once, one of them switched their phone number and had to leave and re-enter the group. This caused everyone's colors to change.

For a couple of weeks it was hell. I was used to Person A being pink, Person B being yellow, and so forth. I would reply to A thinking I was replying to B because their color changed, which caused a lot of confusion.


It just seems like some people prefer highlighting in the literal traditional sense while others like the more common color coordinated code.


