Programming AI Technology

Computer Program Fixes Old Code Faster Than Expert Engineers 167

An anonymous reader writes: Less than two weeks after one group of MIT researchers unveiled a system capable of repairing software bugs automatically, a different group has demonstrated another system called Helium, which "revamps and fine-tunes code without ever needing the original source, in a matter of hours or even minutes." The process works like this: "The team started with a simple building block of programming that's nevertheless extremely difficult to analyze: binary code that has been stripped of debug symbols, which represents the only piece of code that is available for proprietary software such as Photoshop. ... With Helium, the researchers are able to lift these kernels from a stripped binary and restructure them as high-level representations that are readable in Halide, a CSAIL-designed programming language geared towards image-processing. ... From there, the Helium system then replaces the original bit-rotted components with the re-optimized ones. The net result: Helium can improve the performance of certain Photoshop filters by 75 percent, and the performance of less optimized programs such as Microsoft Windows' IrfanView by 400 to 500 percent." Their full academic paper (PDF) is available online.
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • Does that mean the program responds in negative time? even before you click the button it's already done what you wanted? Sounds impractical...

    • by Bengie ( 1121981 ) on Friday July 10, 2015 @09:56AM (#50081373)
      100% faster just means 2x.
    • by narcc ( 412956 )

      Here you go [studyzone.org]

      • by randm.ca ( 901207 ) on Friday July 10, 2015 @10:17AM (#50081543) Homepage
        Am I having an English fail, or are they having a Math fail?

        Mr. Smith's salary has increased by 350% over the past 20 years. If his original salary was $22,000 per year, what is his current salary?
        First change 350% to a decimal: 350% = 3.5
        Next, multiply the original salary by the decimal number: (22000) x (3.5) = 77000
        After 20 years Mr. Smith's salary has increased from $22,000 to $77,000 per year.

        Didn't his salary increase by $77,000, from $22,000 to $99,000?

        To simplify it, if his salary had increased by 100% would that mean he was now making $22,000 (their logic) or $44,000 (my logic)?

        • You seem to be correct. Instead of saying his salary increased by 350%, they should have said his salary increased to 350% of his original salary.
        • by Anonymous Coward
          As you point out, you get to keep the original value (100%, or "1") and ADD the % increase. In other words, you have 100% + the increase.

          Example 1: if sales tax is 10%, or 0.1, and an item is a dollar, you have 10 cents in tax + the dollar, for $1.10 to get out of the store.

          Example 2: A 20% tip on a $10 tab is 1.2 x 10, or $12.

          I.e., always add the "1" for the original you are keeping. For the 350% example, you would have 1 + 3.5 as the multiplier, or 4.5 x $22,000 = $99,000.
        • by alexhs ( 877055 )

          Their obvious math fail.

          Mr. Smith's salary has increased by $77,000 (notice that's exactly how it's worded: by 350%), not "to" $77,000.

          If their answer were to be believed, the original statement should have said that his current salary is 350% of what his salary was 20 years ago.
          One could also say that his salary has increased by $3.5 (% = 1/100 = 0.01, right?).

          I suspect that % is the most misused symbol in mathematics. Most people don't know how to use it correctly in most contexts.

          Then you could h

    • How are you getting that idea? What it's most likely saying is that if a process took, say, 5 seconds before it could take as little as 1 second now. It has nothing to do with "negative" time.
      • But that's 500% of the new value, not the old one. In order to express performance gains in a way that makes sense, you need to express it in terms of the old value.
        • A program that could process 100 MB of data in five seconds can now process 500 MB of data in five seconds. That's a 400% increase in the amount of data it can process per unit of time. People are just really bad at wording this sort of thing for some reason. I think that reason is probably that math and language are taught largely independently and as if one never interacts with the other.

        • Technically, speed or "fast" is measured as quantity / time. If 5.0e9 operations used to take 5.0 seconds and now takes 1.0 second, it's 5 times as fast, just as 50 mph is 5 times as fast as 10 mph.
    • Is that even a thing? How do bits rot?
      • How do bits rot?

        From the head down

      • Is that even a thing? How do bits rot?

        it's only the ones that don't get used as frequently. the zeros are ok

      • Re:Bit-rotted code (Score:5, Informative)

        by swillden ( 191260 ) <shawn-ds@willden.org> on Friday July 10, 2015 @11:56AM (#50082335) Journal

        Is that even a thing? How do bits rot?

        Yes, it's a thing. An important one, well-known to all software developers with a few years of experience.

        Barring some sort of hardware failure the bits themselves don't change, of course. What changes is the context in which the bits are used. The article seems to be focusing mostly on performance, and that's one area of bit rot, a minor one. As hardware changes, the best performance optimization methods change, but the bits don't. Updating the code can improve performance. More important forms of bit rot have to with supporting libraries, or even business context. Causes and effects vary, but at bottom some assumption made by the old code is no longer true. If the assumption was never true, or for some reason the developer should not have expected it to be true, we call this a bug. But sometimes the code did make perfect sense as it was... but no longer. It's a bug now, but was correct previously.

        That is bit rot.

        Wikipedia has a fairly decent article if you want to read more: https://en.wikipedia.org/wiki/... [wikipedia.org]

      • In binary software, "rot" is an abbreviation for "rotate"; if not further specified, it's to the left.

        Like "LROT".

        Of course it could be to the right: RROT (a heavier version of mere ROT).

        A ROT means... ah well... most people know, and I'm too tired to explain it to the noobs.

    • YOU ARE SO SMART!

      It's obviously physically impossible to ever improve the performance of anything by more than 100%. That's why 60+ years of Moore's law means that modern computers have a maximum theoretical performance ceiling of 769 FLOPS, since the ENIAC churned out 385 FLOPS and it's physically impossible to be more than 100% faster.

      Where can I sign you up for the Turing Award?

      http://knology.net/johnfjarvis... [knology.net]

      • That's comparing the performance of different hardware. What's being measured here is the performance of software on the same hardware.
    • I'm a little confused, but it seems to me that whether or not you can improve performance by more than 100% depends on how you measure performance. Cutting a program's run time from 5 seconds to 2.5 seconds is, measured one way, a 50% reduction in run time; measured another, a 100% increase in speed. One of the replies measured something in terms of GFLOPS, but that's a measurement of hardware performance, not software performance on the same machine. I'm not sure what metric would give software a greater than 100% improvement on the same machine.
  • So, hopefully you are doing this on stuff so old you don't have to worry about your support contract, hopefully you can figure out how to legally do this without violating someone's copyright (it's a derivative work), and hopefully you don't introduce any new errors or security holes.

    This sounds cool, but you're in a gray area in terms of legality I should think.

    I guess if it works it's cool, I'm just not clear on how you can prove that the two are 100% the same.

    • by Bengie ( 1121981 )
      Better outlaw bit rot, it changes binaries all the time. Non-checksumming filesystems are now illegal.
    • Copyright is only violated if you distribute the work ... everyone in the business should know that.

      • by mark-t ( 151149 ) <markt AT nerdflat DOT com> on Friday July 10, 2015 @11:04AM (#50081901) Journal

        Actually, copyright is violated when you copy the work without permission and the purpose of the copying does not otherwise exempt it from infringement. If you don't distribute it at all, and you don't profit from it in any way, then nobody else is necessarily even going to know you copied it, let alone want to prosecute you for it, but that does not necessarily mean it is legal. (Some jurisdictions do exempt private copying from infringement; others, such as Canada, exempt private-use copying only so long as a mechanism exists for compensating rights holders for at least some percentage of private-use copies, such as levies on blank media that such content is considered likely to be copied onto.)

        Distribution of unauthorized copies is usually taken as gold-standard proof of infringement, since it establishes, in an objectively verifiable way, a non-private-use purpose for making those copies, which the copyright holder can present as the basis for a claim against the infringer. The act of distributing unauthorized copies is generally illegal as well, a crime related to infringement and at the least evidence of conspiracy to commit copyright infringement, but it is not technically copyright infringement itself, nor is it strictly required for infringement to occur... at most it may be required in order to hold someone accountable for infringement.

        • Actually, copyright is violated when you copy the work without permission and the purpose for copying the work does not otherwise make it exempt from infringement.
          That is ... actually ... wrong.

          All copyright laws in the world are about distributing/selling copies. You can copy whatever you want for personal use. It was always like that.

          There are only newer exceptions, like the DMCA etc., which try to prevent you from copying a CD etc. to a different medium.

          The action of distributing unauthorized copies is generally

          • by mark-t ( 151149 )

            You are mistaken.

            Let's take US law for example.

             Title 17, Chapter 1, section 106 outlines the exclusive rights of a copyright holder, one of which is the right to reproduce the copyrighted work. Sections 107 through 122 outline limitations on those rights.

             Title 17, Chapter 5, section 501 defines copyright infringement. The mere act of copying without permission, which encroaches on the exclusive rights described in section 106, is infringement. Copying for personal use is an argument for fair use, which is a

             • As I mentioned before, the "crime" part is new, requires criminal behaviour, and is a US-specific thing. Modifying software you bought, running only on your own machines, is neither a violation nor a crime.

              • by mark-t ( 151149 )
                Copyright infringement is a crime in several other countries as well, actually; I cited the relevant sections of US law only because I knew exactly where to look for them. I believe similar statutes exist in at least several European countries, and I know for a fact that it is also the case in Canada. But what part of "is" implies that is the way it has always been?
                • Copyright as such does not really exist in Europe; we call it 'Urheberrecht' in German, roughly 'moral rights'.
                  Distributing pirated music and movies has been a crime for a few years now.
                  Copying software, books, or anything else falling under 'copyright' is only prosecuted at the request of the rights holder, who has to prove the infringement, and it is not a crime.

  • by Anonymous Coward

    I always understood bit-rot to be data degrading on a storage medium. Not sub-optimal cruft code from way back.

    • by Junta ( 36770 )

      Incidentally, it's (probably) not that the code was sub-optimal in this case, just mis-optimized. Think of code written before AVX2 existed now gaining an AVX2 codepath, or of two algorithms that produce the same result using different operations, where the choice was made based on chips that ran one of them faster, and newer chips started running the other one faster.

  • by account_deleted ( 4530225 ) on Friday July 10, 2015 @09:53AM (#50081353)
    Comment removed based on user account deletion
    • by Anonymous Coward

      I'd mod you +1 BOFH

    • by Anonymous Coward

      I wrote the advanced version of that program, called format. It optimizes every program on the drive you select. When I use it, the whole department comes into my office furious, likely because they know this program will replace them all.

    • There are not enough +1s for this.

    • del *.* is still a valid way to clear space on my older PC. A little love for DOS please; the machine is DOS 3.1 and runs fine, thank you.

      I too admire the elegant efficiency you show above.

    • Re: (Score:2, Redundant)

      by kheldan ( 1460303 )
      You oldschool guys need to update your game. There's been a newer version of your optimizer out for years now; it's called "Delete system32", and it's superior in every way to your old, busted-down version. Unfortunately it's for Windows only; Anonymous Labs hasn't gotten around to porting it to other operating systems.
  • by Anonymous Coward

    From the paper (final section)

    To rejuvenate these programs, we need high-level, easily-optimizable representations of their algorithms.

    Well shit, if only we had that.

  • Comment removed (Score:5, Interesting)

    by account_deleted ( 4530225 ) on Friday July 10, 2015 @10:02AM (#50081421)
    Comment removed based on user account deletion
    • PS: is it because of the summer-period lack of news or Slashdot is planning to make the move to pure sensationalism? Because a big proportion of the titles lately are quite unclear (to say it softly).

      I think it's a move towards sensationalism. Given that /. is now mirroring article titles on social media (Twitter, Facebook?), it looks more and more as if /. editors have been told to make the titles vague yet provocative.

      Keep in mind, as the page footer says, Slashdot is a DHI Service [dhigroupinc.com].

    • The comment about the article is all wrong. He talks about fixing bugs. No bugs are being fixed. Instead, the system looks for opportunities to optimize code for new hardware the original code wasn't designed to run on.

      It got my attention because I didn't believe someone could have written code to fix bugs, as they are too abstract to be systematically fixed (generally speaking).

  • The article says the problem being addressed is "bit-rot". So they only had one copy of the program, which degraded due to failure of the storage media, and they have to patch it? No. From the context that does not seem to be the case, although that is the only meaning of "bit-rot" that I know.
    In any case, the article seems to continue with the problem that old code optimized for old hardware has to be patched now and then to improve performance. Helium seems to be a tool for that, working directly on binaries, bu

    • by tomhath ( 637240 )
      Bit rot usually means portions of the code haven't been refactored after other parts have been changed. What might have been good enough when it was first written has degraded.
  • It's interesting that they can do this to the object code after it's been generated. But it sounds a lot like traditional compiler optimization to me. Or am I missing something?

    I would also quibble with the headline. It isn't "fixing" the code; it's generating more efficient instructions to do the same thing.

  • by T.E.D. ( 34228 ) on Friday July 10, 2015 @10:14AM (#50081511)

    Most compilers perform their optimization runs after code generation (and often after code analysis, while the code is in some machine-independent tree state that sounds much like the language they invented here).

    This just sounds like it's doing that. Sure, it's source-language independent. But so is gcc's backend. Basically it's taking a "done" executable, reconstructing a form similar to the intermediate tree form [gnu.org] the GCC backend uses, and then doing another optimization pass like GCC's backend does. What's new and unique here?

    • by decsnake ( 6658 )

      Basically it's taking a "done" executable, reconstructing a form similar to the intermediate tree form [gnu.org] the GCC backend uses, and then doing another optimization pass like GCC's backend does. What's new and unique here?

      I think, and I'm not sure about this, that the original work is reconstructing an optimizable intermediate form. If anyone else has done it before I'm not aware of it, but I'm just a practitioner, not a CS researcher.

    • Umm, Crysis runs perfectly fine even on relatively low-end modern computers -- without even breaking a sweat. Either upgrade your computer from that thing from 2007, or find a new meme :p

  • by Viol8 ( 599362 ) on Friday July 10, 2015 @10:16AM (#50081531) Homepage

    Well, it's the obvious question, isn't it: presumably they've eaten their own dog food and run the program on its own binary, right?

  • Reoptimizing is not the same thing as fixing a bug. For one thing, the software would need to know the bug is in fact a bug and what the correct behavior was supposed to be. Suppose your algorithm is coded as c = a - b, but it really is supposed to be c = a + b. Obviously that is a bug but how is the software supposed to know that? Or suppose you used an incorrect conversion factor, but the formula is otherwise correct (Didn't NASA have a lander crash because the program used feet instead of meters or som

  • This seems to optimize code, as opposed to fixing bugs. It won't detect, for example, the possibility of referencing nil. Also, I suspect it would need a very strongly typed language, which would reduce the human intelligence requirement to zero. One other thing I'm wondering is how maintainable the code is afterwards. Does the code make sense, or does it look more like it was written by a robot?
  • by Anonymous Coward

    In 10 years or less, Software Engineers won't be necessary to create Software. Applications will be created by Management based on Business Rules and Data Sets. The only human involvement will be the IA Design and Control of these Rules and Data, as well as UI Design for "look and feel" and branding.

    If you are in your 20s, you'd better start finding a new career path, because your career in Software Development is about over.

    • They've been saying that for 30 years.

    • Applications will be created by Management based on Business Rules and Data Sets.

      It's sad but true. That's why I've been preparing my move into Management. I've been learning to use a cutting-edge Management Suite called "Java" to implement Business Rules, and a bleeding-edge Cloud-based Storage Medium named "MySQL" to manage my Data Sets.

      I'm going to miss Software Engineering, but as you say, the writing's on the wall.

      • "Business Rules" are implemented with rule engines, which might be implemented in Java.

        Clouds may be anything ... not just storage. And MySQL is not related to clouds in any way.

        Your joke failed ...

        • Business rules are implemented however they are implemented. That can be a rules engine, or a more general language. And any rules engine I've seen tends to be infinitely extensible with your language of choice, because simply being a "rules engine" is rarely sufficient.

          And I didn't say clouds were "just storage", I said that MySQL was a "cloud-based storage medium", which is certainly true for the various vague definitions of "cloud-based" I've seen.

          See? My joke was accurate and - now that it has been f

          • If by MySQL you mean your own SQL, then that might be so. If you mean the open source DB now owned by Oracle, then this is not the case ... no clouds involved.

  • Remember Transmeta? (Score:5, Informative)

    by Anonymous Coward on Friday July 10, 2015 @10:39AM (#50081703)

    When you read the actual paper, it seems to analyse the code and take advantage of vectorization in modern hardware to do the equivalent work in a more parallel way, increasing performance. So it strips off local optimization, analyses the action, then recreates it in a form better suited to the processor it's compiled for, resulting in a speedup.

    Sort of reminds me of Transmeta, the processor company Linus was involved with, whose chips ran microcode, analysed the execution of x86 code, and did little mini-compiles of microcode to simulate it, aka 'Code Morphing'.

    https://en.wikipedia.org/wiki/Transmeta

    It did well on benchmarks (and filters) because of their repetitive nature; it could recreate those loops more efficiently.

    Interesting; pity it's all full of boastful fluff and vague bitrot claims, because it sounds like a proper paper otherwise. Of course, for most software that has value, there is competition, and it's worth re-optimizing for each new platform, otherwise you lose market share to competitors. So actual programmers would rework those parts of the code rather than use a general tool hack.

    So this would only be a stopgap for dead-end software you're locked into that isn't being maintained.

    • One big advantage of this over Transmeta's "code morphing" is that the latter has to be done on-the-fly, and has to be done each time it encounters the code, unless it was recent enough to still be in cache, whereas this approach, like a normal compiler, is only done once, and then the results can be executed over and over (even across different systems).

  • After reading through the paper a bit, it seems interesting, but perhaps a bit overblown. It seems to have taken a lot of work to understand the very specific problem domain before this could be applied. It's more like a methodology that *enabled* expert engineers to do optimization, not one that did optimization *instead of* expert engineers.

    It's also a field with a lot of solid technical high level algorithms, so there was a pretty good space to map things to. Basically it was identifying what inscrutable code wa

  • by Virtucon ( 127420 ) on Friday July 10, 2015 @11:36AM (#50082157)

    Read the instructions, loop optimize, branch optimize, etc.; rinse, repeat. It's not novel; I mean, we've had JIT optimization and others for a few years now. It does nothing with "code", only with the binaries generated from code.

  • Buy our miracle product and your old car will shine like new! Just squeeze some out of the bottle and apply with a clean cloth, wipe it off, and look! Your old car's faded, scratched finish looks like brand new! Only $19.95, call to order now and get this special micro-fiber application cloth for free, a $9.95 value!

    That's what this reminds me of. It's something you use on a car you're selling off and want to get a better price for it, not for a car you're keeping, or at least keeping for long. It'll proba
  • Seems like this could be built into compilers. What am I missing?
  • "It's a gas!" -- Doug Benson
  • by wonkey_monkey ( 2592601 ) on Friday July 10, 2015 @01:52PM (#50083449) Homepage

    Microsoft Windows' IrfanView

    Microsoft (or even "Microsoft Windows") doesn't own IrfanView.

  • Windows NT 3.51, or Microsoft PowerPoint, or........
  • I've built Photoshop plug-ins and have gotten paid for it.
    2D graphics algorithm performance pretty much completely depends on relatively small inner loops.
    Minor improvements in the inner loop can have a major impact on performance.
    It's not uncommon that manually optimizing the inner loop makes a 2D graphics algorithm several hundred times faster, simply by reducing typecasting or doing subexpression elimination.

    I'm wondering how well this program optimizes code in other fields. How about optimizing an office appl

  • I was far more impressed by a Y2K conversion system developed at Queen's University in Kingston which would analyze the binaries from several different machine architectures, turn them into flow graphs, and pattern-match the flow graphs into re-created C++ code. Unfortunately, the startup that was launched to bring that system to market imploded due to legal issues and fights, and never went much further than addressing the needs of a few early adopters.

    This system, on the other hand, seems focused on mer

  • If their stuff is that smart, could they please use it to fix the Adobe Flash vulnerabilities for once?
  • "Thou shalt run lint frequently and study its pronouncements with care, for verily its perception and judgement oft exceed thine."
