• 0 Posts
  • 127 Comments
Joined 2 years ago
Cake day: June 16th, 2023

  • Ok, I’ll concede that Chrome makes Google a more popular password manager than I’d considered, and it does try to steer users toward credible generated passwords. Further, by being browser integrated it mitigates some phishing by declining to autofill when the DNS or TLS situation is inconsistent. However, I definitely see people discard the suggestions and pick a word, thinking ‘leet-speak’ makes it hard (“I could never remember that, I need to pick something I remember”). Using it for passwords still means the weak point is human behavior (in selecting the password, in opting not to reuse it, and in divulging it to a phishing attempt).

    If you subscribe to the idea that Google’s password manager is a good solution, it also handles passkeys. That removes the ‘human can divulge the fundamental secret that can be reused’ problem while keeping all of the password manager convenience.


  • Password managers are a workaround, and broadly speaking the overall system is still weak because password managers have relatively low adoption and plenty of people are walking around with poorly managed credentials. They also don’t do anything to mitigate a phishing attack; should the user get fooled, they will leak a password they care about.

    2FA is broad, but I’m wagering you specifically mean TOTP: numbers that change based on a shared secret. Problems there are:

    • Transcribing the code is a pain
    • Password managers mitigate that, but the most common ‘default’ password managers (e.g. the ones built into browsers) do nothing for them
    • Still susceptible to phishing, albeit on a shorter time scale (rough sketch of how those codes are derived at the end of this comment)

    Pub/priv key based tech is the right approach, but passkeys do wrap it up with some obnoxious stuff.
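
    For reference, here’s that sketch of how a TOTP code is derived from the shared secret; the secret is a made-up example and the period/digit count are just the common defaults:

    import base64, hashlib, hmac, struct, time

    def totp(shared_secret_b32: str, period: int = 30, digits: int = 6) -> str:
        """Derive the current code from the shared secret, RFC 6238 style."""
        key = base64.b32decode(shared_secret_b32, casefold=True)
        counter = int(time.time()) // period               # both sides count time steps
        digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
        offset = digest[-1] & 0x0F                         # dynamic truncation
        code = (struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
        return str(code).zfill(digits)

    print(totp("JBSWY3DPEHPK3PXP"))  # example secret only

    The shorter phishing window comes from that counter: a captured code is only replayable until the period rolls over.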


  • “Passkeys are a technology that were surpassed 10 years before their introduction”

    Question is: by what? I could see an argument that it’s an overcomplication of some ill-defined application of x509 certificates or ssh user keys, but at their core they are all comparable fundamental technologies (sketch at the end of this comment).

    The biggest gripe to me is that they are too fussy about when they are allowed and how they are stored, rather than leaving it up to the user. You want to use a passkey with a site that you manually trusted? Tough, not allowed. You want to use one against an IP address, even if that IP address has a valid certificate? Tough, not allowed.
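
    What I mean by comparable at the core: strip away the attestation and origin rules and passkeys, x509 client auth, and ssh keys all reduce to challenge-response over a key pair, roughly like this sketch (generic EC keys via the third-party cryptography package, not the actual WebAuthn flow):

    import os
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec

    # Enrollment: the authenticator keeps the private key, the site only stores the public key.
    private_key = ec.generate_private_key(ec.SECP256R1())
    public_key = private_key.public_key()

    # Login: the site sends a random challenge, the authenticator signs it.
    challenge = os.urandom(32)
    signature = private_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

    # The site verifies the signature; no reusable secret ever crosses the wire.
    try:
        public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
        print("challenge verified")
    except InvalidSignature:
        print("verification failed")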


  • Broadly speaking, I’d say simulation theory is more akin to religion than science, since it’s not really testable. We can draw analogies based on what we see in our own works, but ultimately it’s not really evidence based, just ‘hey, it’s funny that things look like simulation artifacts…’

    There’s a couple of ways one may consider it distinct from a typical theology:

    • Generally theology fixates on a “divine” being or beings as superior entities that we may appeal to, or whose wants we may somehow guess and be rewarded for guessing correctly. Simulation theory would likely have the higher order beings be less elevated in status.
    • One could consider the possibility as something that shapes our behavior, to the extent we ever come anywhere close to making a lower order universe of our own. Theology doesn’t generally present the possibility that we could serve that role relative to another.

  • But that sounds like disproving a scenario no one claimed to be the case: that everything we perceive is as substantial as we think it is and can be simulated at full scale in real time by our own universe.

    Part of the reason people think simulation theory is worth contemplating at all is that they find quantum physics and relativity unsatisfyingly “weird”. They see the ways things break down at relativistic velocities and quantum scales as the sorts of limits a simulation would have if we tried to build one, so they imagine a higher order universe that doesn’t have those pesky “weird” behaviors, with us only stuck with them due to simulation limits within that hypothetical higher order universe.

    Nothing about it is practical, but a lot of these science themed “why” exercises aren’t themselves practical or sciency.


  • With many bureaucracies there’s plenty of practically valueless work going on.

    Because some executive wants to brag about having over a hundred people under them. Because some process requires a sort of document that hasn’t been used in decades, but no one has the time to validate what does or does not matter anymore. Because of a lot of little nonsense reasons where the path of least resistance is to keep plugging away. Because if you are 99% sure something is a waste of time and you optimize it away, there’s a 1% chance you’ll catch hell for a mistake and almost no chance you get real recognition for the efficiency boost if it pans out.


  • Those metrics aren’t any more trustworthy than their own subjective word anyway. If they wanted to say they took more time, they could delay at their whim. If they said their production costs increased, then again, they could spend the money to fit the narrative. On those particular points, objective evidence is so susceptible to being gamed that it isn’t really more valuable than their subjective reporting.

    Subscriber/view numbers could be a bit more informative, but then people inclined to disbelieve would claim it’s due to any number of other reasons, not AI slop.


  • “Killing” in this case sounds like the content is becoming harder and harder to create, which they lay out the subjective case for, but that isn’t exactly something they could present with figures, since it’s so subjective.

    The one point they might have been able to show with numbers would be the emergence of AI slop ‘infotainment animations’ diluting the audience, but that wasn’t exactly the biggest point of the video and it might be a bit early to be able to demonstrate statistically credible evidence on that one.




  • If, hypothetically, the code had the same efficacy and quality as human code, then it would be much cheaper and faster. Even if it was actually a little bit worse, it still would be amazingly useful.

    My dishwasher sometimes doesn’t fully clean everything; it’s not as strong a guarantee as doing it myself. I still use it because, despite the lower quality wash that requires some spot cleaning, I still come out ahead.

    Now, this was hypothetical; LLM generated code is damn near useless for my usage, despite assumptions it would do a bit more. But if it did generate code that matched the request with a comparable risk of bugs to doing it myself, I’d absolutely be using it. I suppose with the caveat that the code has to stay within my ability to actually diagnose problems too…


  • Based on my experience, I’m skeptical that people who seemingly delegate their reasoning to an LLM were really good engineers in the first place.

    Whenever I’ve tried, it’s been so useless that I can’t really develop a reflex, since it would have to actually help for me to get used to just letting it do its thing.

    Meanwhile, the bullish people who are ostensibly the good engineers I’ve worked with are the ones who became pet engineers of executives and have long succeeded by sounding smart to those executives rather than by doing anything or providing concrete technical leadership. They are more like having something akin to Gartner on staff, except without even the data that Gartner at least actually gathers, even as Gartner is a useless entity with respect to actual guidance.


  • They are still bullish on LLMs, just as augmentation rather than a replacement for human-driven development.

    This perspective is quite consistent with the need for a product that manages prompting/context for a human user and helps the human review and integrate the LLM supplied content in a reasonable way.

    If LLMs were as useful as some of the fanatics say, you’d just use a generic prompt and it would poop out the finished project. That is, by the way, the perspective of an executive I talked to not long ago: he was going to be able to let go of all his “coders” and feed his “insight” directly into a prompt that would do it all for him instead. He is also easily influenced, so articles like this can reshape him into a more tenable position, after which he’ll pretend he never thought a generic prompt would be good enough.


  • Subjectively speaking, I don’t see it doing that good a job of being current or prioritizing current material over older.

    While RAG is the way to give an LLM a shot at staying current, I just didn’t see it doing that good a job with library documentation. Maybe it can do all right with tweaks like additional properties or arguments, but I just don’t see more structural changes to libraries being handled.
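
    For anyone unfamiliar, the RAG idea is just to fetch the most relevant doc snippets and stuff them into the prompt so the model sees current documentation rather than whatever it trained on. A toy sketch, where the library name, snippets, and word-overlap scoring are all invented for illustration (a real system would use embeddings):

    doc_snippets = [
        "somelib 3.0: Client() now requires an explicit timeout keyword argument.",
        "somelib 3.0: the old connect() helper was removed; use Client().open() instead.",
        "somelib 2.x: connect(host, port) returns a raw socket wrapper.",
    ]

    def retrieve(question: str, snippets: list[str], top_k: int = 2) -> list[str]:
        """Rank snippets by naive word overlap with the question."""
        q_words = set(question.lower().split())
        ranked = sorted(snippets, key=lambda s: len(q_words & set(s.lower().split())), reverse=True)
        return ranked[:top_k]

    question = "How do I open a connection with somelib?"
    context = "\n".join(retrieve(question, doc_snippets))
    prompt = f"Use only the documentation below.\n\n{context}\n\nQuestion: {question}"
    print(prompt)

    Small additive tweaks survive this fine; a wholesale restructuring of a library is exactly where a couple of retrieved snippets stop being enough context.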


  • I have been using it a bit, though I still can’t decide if it is useful or not… It can occasionally suggest a blatantly obvious couple of lines of code here and there, but along the way I get inundated with annoying, useless suggestions that I haven’t gotten used to ignoring.

    I mostly work in a niche area the LLMs seem broadly clueless about, and prompt-driven code is almost always useless except when dealing with super boilerplate usage of a common library.

    I do know some people that deal with amazingly mundane and common functions and they are amazed that it can pretty much do their jobs, but they never really impressed me before anyway and I wondered how they had a job…



  • The problem with the “benchmarks” is Goodhart’s Law: once a measure becomes a target, it ceases to be a good measure.

    The AI companies’ obsession with these tests causes them to maniacally train on them, making the models better at those tests, but that doesn’t necessarily map to actual real world usefulness. Occasionally you’ll see a guy who interviews well but is pretty useless on the job; LLMs are basically that guy all the time, though at least useful because they are cheap and fast enough to be worth it for the super easy bits.


  • The overall interface can, which leads to fun results.

    Prompt for image generation and you have one model doing the text and a different model doing the image. The text model pretends it is generating the image but has no idea what that image would look like, so you can make the text and image interaction make no sense, or it will do so all on its own. Have it generate an image, then lie to it about the image it produced, and watch it completely reveal it has no idea what picture was ever shown, all the while pretending it does and never explaining that it’s actually delegating the image. It just lies and says “I” am correcting that for you. Basically it talks like an executive at a company, which helps explain why so many executives are true believers.

    A common thing is for the ensemble to recognize mathy stuff and feed it to a math engine, perhaps after using LLM techniques to normalize the math.
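
    Conceptually the delegation is just a router sitting in front of separate engines; here’s a toy sketch where the detection rule and the stand-in “engines” are invented for illustration:

    import re

    def math_engine(expr: str) -> str:
        # stand-in for a CAS/calculator; guarded eval is tolerable here because the
        # router only lets plain arithmetic through
        return str(eval(expr, {"__builtins__": {}}, {}))

    def image_model(prompt: str) -> str:
        return f"<image generated for: {prompt!r}>"

    def text_model(prompt: str) -> str:
        return f"<LLM reply to: {prompt!r}>"

    def route(user_input: str) -> str:
        if re.fullmatch(r"[\d\s.+\-*/()]+", user_input):   # looks like plain arithmetic
            return math_engine(user_input)
        if user_input.lower().startswith(("draw", "generate an image")):
            return image_model(user_input)
        return text_model(user_input)   # the text model never sees what the image model produced

    print(route("12 * (3 + 4)"))
    print(route("draw a cat wearing a hat"))
    print(route("explain why the sky is blue"))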



  • Not a single one of the issues I brought up years ago was ever addressed, except one.

    That’s the thing about AI in general: it’s really hard to “fix” issues. You can maybe try to train a problem out and hope for the best, but then you might play whack-a-mole, as the attempt to fine tune away one issue might make others crop up. So you pretty much have to decide which problems are the most tolerable and largely accept them. You can apply alternative techniques to catch egregious issues, like a non-AI technique that stuffs the prompt and influences the model to go in a certain general direction (if it’s an LLM; other AI technologies don’t have this option, but they aren’t the ones getting crazy money right now anyway). A toy sketch of that sort of prompt stuffing is at the end of this comment.

    A traditional QA approach is frustratingly less applicable because more often you have to shrug and say “the attempt to fix it would be very expensive, not guaranteed to actually fix the precise issue, and risks creating even worse issues”.
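
    The kind of toy prompt stuffing I have in mind (the trouble topics and steering text are invented for illustration):

    # Non-AI mitigation: instead of retraining, detect known trouble topics with plain
    # string matching and stuff steering text into the prompt.
    KNOWN_TROUBLE = {
        "refund": "Policy: never promise a refund; direct the user to the billing team.",
        "dosage": "Policy: do not give dosage advice; tell the user to consult a pharmacist.",
    }

    def build_prompt(user_input: str) -> str:
        steering = [text for keyword, text in KNOWN_TROUBLE.items() if keyword in user_input.lower()]
        preamble = "\n".join(steering)
        return f"{preamble}\n\nUser: {user_input}" if steering else f"User: {user_input}"

    print(build_prompt("Can I get a refund for last month?"))

    It only nudges the model rather than guaranteeing anything, which is exactly why the QA story stays so unsatisfying.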