“And the award for Best Data Matching Tool in a Supporting Role goes to…”
With Skynet having become self-aware last week and TCM’s Classic Film Festival starting this week, I thought this would be a good time to consider some of the “Great Data Quality and MDM Moments in the Movies.” Please feel free to add your own nominees in the comments section.
A screwball comedy directed by Howard Hawks and scripted by Billy Wilder should be terrific. Throw in Gary Cooper and Barbara Stanwyck and it should be a real delight.
That’s what you have in this little-known gem that stars Cooper as a bookish professor who falls in love with nightclub singer Stanwyck.
In one scene, Coop thinks he’s going into a colleague’s darkened bungalow to explain how deeply in love he is. But unbeknownst to him, the number on the door he opens (a “9”) has slipped upside-down to form a “6.” He’s actually in Stanwyck’s bungalow and confessing his love to her. Gunfights, chase scenes, fisticuffs, and wedded bliss ensue.
This was a case of bad data that actually led to a happy ending. But this is the movies, after all; you can’t count on that happening in real life.
This “woman vs. machine” story stars Spencer Tracy as an engineer who’s created EMERAC, a computer designed to “free the worker from the routine and repetitive tasks and liberate his time for more important work” — in other words, to replace the research department at a television network.
Spence ends up knocking heads with (surprise!) Katharine Hepburn, the chief researcher, who takes a John-Henry-like stance against the coming of automation.
Although EMERAC succeeds at heavy lifting such as language translation, it fails miserably at jobs that require distinguishing between subtly different pieces of data. Poor EMERAC finally goes berserk trying to keep up with an overload of such tasks, memorably confusing the Greek island of Corfu with the poem “Curfew Must Not Ring Tonight.”
Things have gotten somewhat better for AI software in the 50-odd years since. But it’s still “garbage in, garbage out” — hence the need for continued vigilance where data quality is concerned.
It’s the good guys against the Nazis in a race to find one of the world’s great archaeological treasures: the lost Ark of the Covenant.
A critical data set needed to pinpoint its location is inscribed on the headpiece to the staff of Ra. Both sides of the headpiece. The Jerries only have the data from one side, which means “they’re digging in the wrong place,” to quote Indiana Jones (Harrison Ford) and his sidekick Sallah (John Rhys-Davies).
Too bad for the bad guys they didn’t have a business rule in place specifying that “Both sides of all ancient medallions and amulets shall be examined for data relevant to the mission.”
The future governor of California plays a cyborg assassin sent from even further in the future to kill the mother of the unborn savior of the human race. (Got that?)
Ahnold knows her name — “Sarah Connor” — and the town she lives in. But despite having the most advanced computing power imaginable, his creators didn’t bother equipping him with basic identity resolution software.
As a result, he doesn’t know which Sarah Connor he’s looking for — so he just looks up all the Sarah Connors in the local phone book and starts killing them one by one. We as an industry must do better than that.
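For the curious, here is a minimal sketch of what even the most basic identity resolution might look like: start with every record that shares the name, then narrow the list using whatever other attributes are known. The `Person` fields, the `resolve` helper, and every value in the sample phone book are invented for illustration; none of it comes from the movie or from any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical attributes; a real system would carry many more.
    name: str
    birth_year: int
    occupation: str


def resolve(candidates, **known):
    """Keep only the candidates that agree with every known attribute."""
    return [
        person
        for person in candidates
        if all(getattr(person, field) == value for field, value in known.items())
    ]


# Made-up sample data: three people who share one name.
phone_book = [
    Person("Sarah Connor", 1949, "accountant"),
    Person("Sarah Connor", 1965, "waitress"),
    Person("Sarah Connor", 1958, "teacher"),
]

# Name alone is ambiguous; one extra attribute settles it.
matches = resolve(phone_book, name="Sarah Connor", occupation="waitress")
print(matches)
```

The point is simply that a name is rarely a unique key. One additional trusted attribute turns three candidates into one.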
In this dystopian cult favorite, Sam Lowry (Jonathan Pryce) is a downtrodden government clerk who longs to escape the drudgery of his work and his life — but instead finds himself pulled into the heart of bureaucratic evil.
Lowry is assigned to clear up a case of wrongful arrest — an innocent man named Harry Buttle has been detained instead of Harry Tuttle, an anti-government terrorist. The cause of the error? A bug in the system. (Literally — a fly has dropped into a teletype machine at exactly the wrong moment, causing the printer to jump and issue the warrant for “Buttle” instead of “Tuttle”).
That simple piece of bad data ultimately leads Lowry to either a romantic escape or a tragic future, depending on whether you’re watching the North American or European release.
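A one-character slip like “Buttle” for “Tuttle” is the kind of thing a simple similarity check can catch before the paperwork goes out. The sketch below uses `get_close_matches` from Python’s standard `difflib` module; the `near_misses` helper, the list of known names, and the 0.8 cutoff are assumptions chosen for illustration, not a recommendation for production matching.

```python
from difflib import get_close_matches


def near_misses(name, known_names, cutoff=0.8):
    """Return known names that are suspiciously similar to, but not the same as, `name`."""
    close = get_close_matches(name, known_names, n=5, cutoff=cutoff)
    return [known for known in close if known != name]


# Hypothetical list of names already on file.
known_names = ["Tuttle", "Lowry", "Kurtzmann"]

flagged = near_misses("Buttle", known_names)
if flagged:
    print(f"Hold the warrant: 'Buttle' looks a lot like {flagged}")
```

A check like this would not decide who is right. It would only route the record to a human before anyone kicks a door in.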
“All right, kid, here’s the deal. At any given time there are approximately 1,500 aliens on the planet, most of them right here in Manhattan. And most of them are decent enough, they’re just trying to make a living.”
That’s how Agent Kay (Tommy Lee Jones) explains the mission of the Men in Black to Agent Jay (Will Smith). Their job is to keep track of these extraterrestrial guests and keep them out of trouble.
Think about what it must be like to manage the taxonomies and attributes on 1,500 alien life forms. You’ve got aliens with one head, two heads and more. Aliens with two, four and six arms. Are tentacles classified as “arms” or “legs”? Or are they a separate attribute? Shouldn’t Venusians be in the “Silicon-based Life Forms” category and the “Terrestrial Planet” category?
Clearly, a standard database or ERP system won’t do; this calls for an AIM (alien information management) solution that’s literally out of this world.
Computers were in their infancy in World War II. Machines that filled entire rooms were only capable of specialized tasks such as decoding enemy cables and computing artillery azimuths.
That left large organizations to depend on people to recognize patterns in routine data and information — as depicted in a heart-wrenching scene from this genre-redefining war movie.
Two of four brothers have been killed during the Normandy invasion. A third brother dies the same day in the Pacific. After witnessing the nightmarish landing at Omaha Beach, we’re taken into the labyrinth of the War Department, where a vast secretarial pool types letters of regret to families across the country.
Soon we see a horrified secretary stop her typing and rush to her superiors to report that she’s already prepared a letter for this same Gold Star mother. Her astute observation sets the rest of the story in motion as the Army assigns Captain John Miller (Tom Hanks) to lead a squad to find and bring home the lone surviving brother.
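Today the secretary’s catch would be a routine grouping query: collect outgoing letters by recipient and flag anyone who is about to receive more than one. Here is a rough sketch, with the record layout (`recipient`, `town`, `regarding`) and all sample values invented purely for illustration.

```python
from collections import defaultdict


def recipients_with_multiple_letters(letters):
    """Group letters by a normalized (recipient, town) key and return the repeats."""
    by_recipient = defaultdict(list)
    for letter in letters:
        # Normalize case and whitespace so trivial typing differences still match.
        key = (letter["recipient"].strip().lower(), letter["town"].strip().lower())
        by_recipient[key].append(letter["regarding"])
    return {key: names for key, names in by_recipient.items() if len(names) > 1}


# Made-up sample records.
letters = [
    {"recipient": "Mrs. A. Example", "town": "Anytown", "regarding": "Pvt. One"},
    {"recipient": "Mr. B. Sample", "town": "Othertown", "regarding": "Sgt. Three"},
    {"recipient": "mrs. a. example ", "town": "Anytown", "regarding": "Pvt. Two"},
]

print(recipients_with_multiple_letters(letters))
```

The normalization step matters as much as the grouping. Without it, two letters typed slightly differently would slip past as unrelated.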
Organizations need current data from canonical sources for applications such as business intelligence and marketing communications. They get into trouble when they work with information that’s too far disconnected in time and context from its original source.
Consider what happens when a delegation of aliens mistakes TV sitcoms, “lost in space” for decades, for news reports and documentaries. That’s the premise that gets things rolling in this sci-fi spoof, as friendly aliens come to Earth to recruit a Shatner-esque has-been TV star (“Commander Taggart,” played by “Jason Nesmith,” played by Tim Allen) to lead their fight with an oppressive planetary neighbor.
The visitors have mistaken TV programs like Nesmith’s old science fiction show, whose transmissions are still traveling through outer space years after its cancellation, for historical documents.
Nesmith’s co-star, Gwen DeMarco (Sigourney Weaver), tries to explain to the naive aliens that “They’re not ALL ‘historical documents.’ Surely, you don’t think Gilligan’s Island is a…” “Those poor people,” the alien leader sighs.
An intriguing mystery told in reverse chronological order: the story starts at the end and works its way to the beginning. We watch the sequence of events backward as Leonard Shelby (Guy Pearce) hunts for the man who murdered his wife.
Because he suffers from short-term memory loss, Shelby must rely on an improvised collection of clues (hand-scribbled notes, Polaroid snapshots, and tattoos on his own body) to track down the killer.
Enterprises should have a central master data repository to store and manage this kind of mission-critical information. But here Leonard has no choice but to act as his own repository.
It’s 700 years in the future, and Planet Earth is so overrun with junk that it’s no longer habitable. The human race has been sent on a centuries-long voyage on a fleet of space-age cruise liners while a race of robots is left on Earth to clean things up.
(After generations of luxurious living aboard their intergalactic cruise ships, people have evolved into boneless blobs lounging in front of computer terminals for their shopping, entertainment, meals, and social lives. Wait…just how far into the future is this, anyway?)
Meanwhile, WALL-E (“voiced” by Ben Burtt), the last of the clean-up robots, continues his mission of compacting trash and stacking it in neat piles. He also collects various knick-knacks that catch his fancy to make his garage a little homier.
He stores his trove in vertical carousel shelves, and knows exactly what he has and where everything is; for example, he can quickly find a replacement part for himself when he needs it. We assume he has some kind of internal inventory system where he can keep track of the artifacts in his collection — which makes data accuracy essential.
However, one of his favorite pastimes is watching a video cassette of “Hello, Dolly!” on a salvaged VCR; he should consider transferring it to a digital video file, both to preserve it and so he can manage it in a digital asset management (DAM) system.
* * * * *
Yes, I know. Four of the 10 movies are about aliens or robots, and several depict a dystopian future. I’m not sure what that says about me, or about movies featuring problems with data quality and data management.
There must be many other examples. Maybe something cheerier? Or from a different genre? What are your favorite data quality and MDM moments in the movies?