• grabyourmotherskeys@lemmy.world · 168 points · 2 years ago

    I haven’t read the article because documentation is overhead, but I’m guessing the real reason is that the guy who kept saying they needed to add more storage was repeatedly told to calm down and stop overreacting.

    • Dojan@lemmy.world · 14 points · 2 years ago

      Ballast!

      Just plonk a large file on the storage, sized to roughly a work week’s worth of normal growth. Then when shit hits the fan, delete the ballast and you’ve suddenly bought a week to “find” and implement a solution. You’ll be hailed as a hero, rather than being the annoying doomer who keeps bothering people about technical stuff that’s irrelevant to the here and now.
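
      Something like this is all it takes (a hedged sketch: the path and the “one week of growth” size are made-up assumptions, and posix_fallocate needs a Unix-like OS):

      ```python
      # Hypothetical ballast helper -- path and size are assumptions, not anything from the article.
      import os

      BALLAST_PATH = "/data/.ballast"    # assumed mount that tends to fill up
      WEEK_OF_GROWTH = 50 * 1024**3      # assume ~50 GiB of normal weekly growth

      def create_ballast(path: str = BALLAST_PATH, size: int = WEEK_OF_GROWTH) -> None:
          """Reserve real disk space by pre-allocating a throwaway file."""
          fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
          try:
              # posix_fallocate actually allocates the blocks; a sparse file would free nothing later.
              os.posix_fallocate(fd, 0, size)
          finally:
              os.close(fd)

      def release_ballast(path: str = BALLAST_PATH) -> None:
          """Shit has hit the fan: drop the ballast and 'buy' a week to fix the real problem."""
          if os.path.exists(path):
              os.remove(path)

      if __name__ == "__main__":
          create_ballast()      # run once, long before the disk is anywhere near full
          # release_ballast()   # run during the incident, then go collect your hero status
      ```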

  • Swiggles@lemmy.blahaj.zone · 48 points · 2 years ago

    This happens. Recently we had a problem in production where our database grew by a factor of 10 in just a few minutes due to a replication glitch. Of course it took down the whole application as we ran out of space.

    Some things just happen, and all the headroom and monitoring in the world cannot save you if things go seriously wrong. You cannot prepare for everything, in life or in IT I guess. It is part of the job.
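
    For illustration, the usual kind of threshold check looks something like this (mount point and threshold are made-up assumptions). A poll-every-few-minutes alert like it is exactly what a database growing 10x in a couple of minutes blows straight past:

    ```python
    # Minimal sketch of a cron-style disk usage alert; values are assumptions.
    import shutil

    MOUNT = "/var/lib/db"     # assumed data mount
    ALERT_AT = 0.80           # warn once 80% of the space is used

    def disk_alert(mount: str = MOUNT, threshold: float = ALERT_AT) -> bool:
        """Return True and print an alert if the mount is fuller than the threshold."""
        usage = shutil.disk_usage(mount)
        used_fraction = usage.used / usage.total
        if used_fraction >= threshold:
            print(f"ALERT: {mount} is {used_fraction:.0%} full")
            return True
        return False

    if __name__ == "__main__":
        disk_alert()
    ```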

    • RidcullyTheBrown@lemmy.world · 14 up / 1 down · 2 years ago

      Bad things can happen, but that’s why you build disaster recovery into the infrastructure. Especially with a company as big as Toyota, you can’t have a single point of failure like this. They produce over 13,000 cars per day. This failure cost them close to 300,000,000 dollars just in cars.
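
      Rough back-of-envelope for that figure (the average per-vehicle value is my assumption, not from the article):

      ```python
      # ~13,000 cars/day at an assumed average of ~$23,000 per vehicle
      cars_per_day = 13_000
      assumed_value_per_car = 23_000   # USD, rough guess
      print(f"~${cars_per_day * assumed_value_per_car:,} per day of stoppage")   # ~$299,000,000
      ```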

      • frododouchebaggins@lemmy.world · 12 points · 2 years ago

        The IT people who want to implement that disaster recovery plan don’t make the purchasing decisions. It takes an event like this to get the C-suite to listen to IT staff.

        • GloveNinja@lemmy.world · 3 points · 2 years ago

          In my experience, the C-Suite dicks will put the hammer down on someone and maybe fire a couple of folks. They’ll demand a summary of what happened and what will be done to stop it from happening again. IT will provide legit options to resolve this long term, but because those come with a price tag, they’ll be told to fix it with “process changes” and the cycle continues.

          If they give IT money, that’s less left over for their own EOY bonuses, so it’s a big concern /s

      • Swiggles@lemmy.blahaj.zone · 4 points · 2 years ago

        Yea, fair point regarding the single point of failure. I guess it was one of those scenarios that should just never happen.

        I am sure it won’t happen again though.

        As I said, it can just happen even when you have redundant systems and everything. Sometimes you don’t think about that one unlikely scenario, and boom.

  • R0cket_M00se@lemmy.world · 15 points · 2 years ago

    Was this that full shutdown everyone thought was going to be malware?

    The worst malware of all: unsupervised junior sysadmins.

  • RFBurns@lemmy.world · 4 points · 2 years ago

    Storage has never been cheaper.

    There’s going to be a seppuku session in somebody’s IT department.