Sling Blade Runner
Check out this puzzle from ITA Software. (if that link ever goes bad, here is their puzzle archive).
“How long a chain of overlapping movie titles, like Sling Blade Runner, can you find?”
Use the following listing of movie titles: MOVIES.LST. Multi-word overlaps, as in “License to Kill a Mockingbird,” are allowed. The same title may not be used more than once in a solution. Heuristic solutions that may not always produce the greatest number of titles will be accepted: seek a reasonable tradeoff of efficiency and optimality.
I started tinkering with this last week and spent several evenings working on a solution. My first attempt was a brute force algorithm that tried every possible combination.
After letting my program run 24 hours, it was still processing the third movie…out of a list of 6561 movies! Clearly brute force is not the way to go.
Building the Data Structure
My current algorithm works as follows:
- Load the file into memory
- For each movie title, create a Movie object
- Each Movie has a list of “prefixes” and “suffixes”. For example, prefixes for “TO KILL A MOCKINGBIRD” are “TO”, “TO KILL”, and “TO KILL A”. I put all prefixes and suffixes into hash maps. Furthermore, I have a cross reference Map that uses an integer as a key that maps to String prefix and suffix values.
- I then create a graph, linking every matching prefix and suffix together, using the integers from my lookup table. Thus, each Movie object now has a
Setof every possible predecessor and successor movies. - Prune all movies that have no predecessors or successors. This eliminates 2092 movies right away.
- Walk the graph…more on this in a bit
Reading the data file, building the graph, and pruning the unwanted movies takes less than a second. Walking the graph is where it gets interesting.
Walking the Graph
Do this for each movie:
- The first movie is the starting point. Its distance is 0.
- Find all predecessors. Their distance is 1.
- Recursively continue, incrementing the count by 1 each time.
It’s a wee bit more complicated than that, but not much. For each “starting movie”, I create a new linked list. Each node in the list contains the distance from the starting point, a reference to the current movie, a reference to the starting point, and a link to the predecessor node in the linked list.
By keeping these linked list nodes (along with an extra Map in each Movie), I can avoid hitting the same movie more than once per starting point.
Lessons Learned
The graph contains *many* cycles. Thus, depending on the order in which you traverse the predecessors, you get different counts. On average, I find chains around 200 movie titles long. This takes around 15 seconds on my computer. I just randomly shuffle the children and keep re-running the app, hoping to find a longer chain.
I only have a single CPU machine. My algorithm is memory intensive due to all of the linked lists, but that’s no problem for several thousand movies. If I had more CPUs, I could very easily make it threaded to (hopefully) improve performance. For millions of titles, you’d have to come up with a different approach I’m sure.
It turns out that pruning those 2092 movies hardly matters in terms of runtime performance. That was a big surprise.
Just for Fun
I didn’t do this to apply for a job at ITA. It just looked like a good challenge. It turned out to be a bit harder than I first anticipated. I really don’t have any ideas for reliably finding the “longest” chains.
The Longest Chain
Here is my best chain so far.
1. MRS PARKER AND THE VICIOUS CIRCLE
2. CIRCLE OF FRIENDS
3. FRIENDS WITH MONEY
4. MONEY FOR NOTHING
5. NOTHING BUT TROUBLE
6. TROUBLE IN PARADISE
7. PARADISE ROAD
8. ROAD GAMES
9. GAMES PEOPLE PLAY NEW YORK
10. NEW YORK NEW YORK
11. NEW YORK COP
12. COP LAND
13. LAND OF THE DEAD
14. DEAD MAN
15. MAN ON FIRE
16. FIRE IN THE SKY
17. SKY HIGH
18. HIGH SPIRITS
19. SPIRITS OF THE DEAD
20. THE DEAD GIRL
21. GIRL IN THE CADILLAC
22. CADILLAC MAN
23. MAN OF THE HOUSE
24. THE HOUSE OF THE DEAD
25. DEAD BANG
26. BANG BANG YOURE DEAD
27. DEAD MAN WALKING
28. WALKING AND TALKING
29. TALKING ABOUT SEX
30. SEX AND THE OTHER MAN
31. MAN TROUBLE
32. TROUBLE EVERY DAY
33. DAY OF THE WOMAN
34. THE WOMAN IN RED
35. RED HEAT
36. HEAT AND DUST
37. DUST TO GLORY
38. GLORY ROAD
39. ROAD HOUSE
40. HOUSE PARTY
41. PARTY MONSTER
42. MONSTER IN A BOX
43. BOX OF MOON LIGHT
44. LIGHT OF DAY
45. DAY FOR NIGHT
46. NIGHT OF THE LIVING DEAD
47. DEAD OF NIGHT
48. NIGHT MOTHER
49. MOTHER JUGS AND SPEED
50. SPEED 2 CRUISE CONTROL
51. CONTROL ROOM
52. ROOM AT THE TOP
53. TOP GUN
54. GUN CRAZY
55. CRAZY AS HELL
56. HELL NIGHT
57. NIGHT AND THE CITY
58. CITY BY THE SEA
59. SEA OF LOVE
60. LOVE AND DEATH
61. DEATH WISH V THE FACE OF DEATH
62. DEATH BECOMES HER
63. HER MAJESTY MRS BROWN
64. BROWN SUGAR
65. SUGAR AND SPICE
66. SPICE WORLD
67. WORLD TRADE CENTER
68. CENTER STAGE
69. STAGE FRIGHT
70. FRIGHT NIGHT
71. NIGHT AND DAY
72. DAY OF THE DEAD
73. DEAD END
74. END OF DAYS
75. DAYS OF HEAVEN
76. HEAVEN CAN WAIT
77. WAIT UNTIL DARK
78. DARK CITY
79. CITY OF JOY
80. JOY RIDE
81. RIDE THE HIGH COUNTRY
82. COUNTRY LIFE
83. LIFE AS A HOUSE
84. HOUSE OF FRANKENSTEIN
85. FRANKENSTEIN AND THE MONSTER FROM HELL
86. HELL UP IN HARLEM
87. HARLEM RIVER DRIVE
88. DRIVE ME CRAZY
89. CRAZY PEOPLE
90. PEOPLE WILL TALK
91. TALK OF ANGELS
92. ANGELS WITH DIRTY FACES
93. FACES OF DEATH 4
94. 4 LITTLE GIRLS
95. GIRLS OF SUMMER
96. SUMMER CATCH
97. CATCH A FIRE
98. FIRE ON THE MOUNTAIN
99. THE MOUNTAIN MEN
100. MEN WITH GUNS
101. GUNS OF THE MAGNIFICENT SEVEN
102. THE MAGNIFICENT SEVEN RIDE
103. RIDE WITH THE DEVIL
104. THE DEVIL RIDES OUT
105. OUT OF THE PAST
106. PAST MIDNIGHT
107. MIDNIGHT RUN
108. RUN SILENT RUN DEEP
109. DEEP BLUE
110. BLUE CAR
111. CAR 54 WHERE ARE YOU
112. YOU LIGHT UP MY LIFE
113. LIFE WITH FATHER
114. FATHER OF THE BRIDE
115. BRIDE OF THE WIND
116. THE WIND AND THE LION
117. THE LION KING
118. KING OF THE JUNGLE
119. JUNGLE 2 JUNGLE
120. JUNGLE BOOK
121. BOOK OF LIFE
122. LIFE IS BEAUTIFUL
123. BEAUTIFUL GIRLS
124. GIRLS WILL BE GIRLS
125. GIRLS GIRLS GIRLS
126. GIRLS JUST WANT TO HAVE FUN
127. FUN AND FANCY FREE
128. FREE WILLY 2 THE ADVENTURE HOME
129. HOME ALONE 3
130. 3 NINJAS KICK BACK
131. BACK TO SCHOOL
132. SCHOOL OF ROCK
133. ROCK STAR
134. STAR TREK IV THE VOYAGE HOME
135. HOME ALONE
136. ALONE IN THE DARK
137. DARK STAR
138. STAR TREK THE MOTION PICTURE
139. PICTURE BRIDE
140. BRIDE OF THE MONSTER
141. MONSTER HOUSE
142. HOUSE PARTY 3
143. 3 NINJAS KNUCKLE UP
144. UP CLOSE AND PERSONAL
145. PERSONAL BEST
146. BEST OF THE BEST
147. THE BEST OF EVERYTHING
148. EVERYTHING RELATIVE
149. RELATIVE FEAR
150. FEAR X
151. X THE MAN WITH THE X RAY EYES
152. EYES OF AN ANGEL
153. ANGEL BABY
154. BABY SECRET OF THE LOST LEGEND
155. LEGEND OF THE LOST
156. THE LOST BOYS
157. BOYS LIFE
158. LIFE OR SOMETHING LIKE IT
159. IT TAKES TWO
160. TWO FRIENDS
161. FRIENDS AND LOVERS
162. LOVERS AND OTHER STRANGERS
163. STRANGERS WHEN WE MEET
164. MEET JOE BLACK
165. BLACK AND WHITE
166. WHITE HUNTER BLACK HEART
167. HEART CONDITION
168. CONDITION RED
169. RED EYE
170. EYE FOR AN EYE
171. AN EYE FOR AN EYE
172. EYE OF GOD
173. GOD TOLD ME TO
174. TO DIE FOR
175. FOR THE BOYS
176. BOYS ON THE SIDE
177. SIDE OUT
178. OUT COLD
179. COLD FEVER
180. FEVER PITCH
181. PITCH BLACK
182. BLACK HAWK DOWN
183. DOWN TO EARTH
184. EARTH GIRLS ARE EASY
185. EASY COME EASY GO
186. GO NOW
187. NOW YOU SEE HIM NOW YOU DONT
188. DONT GO IN THE HOUSE
189. HOUSE OF DRACULA
190. DRACULA DEAD AND LOVING IT
191. IT HAPPENED AT THE WORLDS FAIR
192. FAIR GAME
193. GAME OF DEATH
194. DEATH WISH
195. WISH UPON A STAR
196. A STAR IS BORN
197. BORN AMERICAN
198. AMERICAN ME
199. ME MYSELF I
200. I LOVE YOU TO DEATH
201. DEATH SHIP
202. SHIP OF FOOLS
203. FOOLS RUSH IN
204. IN PRAISE OF OLDER WOMEN
205. WOMEN IN LOVE
206. LOVE WALKED IN
207. IN COLD BLOOD
208. BLOOD DIAMOND
209. DIAMOND MEN
210. MEN CRY BULLETS
211. BULLETS OVER BROADWAY
212. BROADWAY DANNY ROSE
213. ROSE RED
214. RED RIVER
215. RIVER OF NO RETURN
216. RETURN TO HORROR HIGH
217. HIGH SCHOOL HIGH
218. HIGH CRIMES
219. CRIMES OF THE HEART
220. THE HEART OF ME
221. ME WITHOUT YOU
222. YOU ONLY LIVE ONCE
223. ONCE AROUND
224. AROUND THE BEND
225. BEND OF THE RIVER
226. THE RIVER WILD
227. WILD THINGS
228. THINGS TO COME
229. COME AND GET IT
230. IT HAPPENED ONE NIGHT
231. ONE NIGHT WITH THE KING
232. THE KING AND I
233. I WANT TO LIVE
234. LIVE FOREVER
235. FOREVER YOUNG
236. YOUNG SHERLOCK HOLMES
237. SHERLOCK HOLMES AND THE VOICE OF TERROR
238. TERROR BY NIGHT
239. NIGHT FALLS ON MANHATTAN
240. MANHATTAN MURDER MYSTERY
241. MYSTERY ALASKA
242. ALASKA SPIRIT OF THE WILD
243. THE WILD ONE
244. ONE EIGHT SEVEN
245. SEVEN YEARS IN TIBET
I like little problems like this and your graph solution seems like a good one. Since there’s been a lot of talk about Patricia tries lately from the with regards to google-collections, that sprang to mind as a possible solution as well. Given a starting movie, you could find all of the ones that are prefixed by its suffixes in a pretty efficient way. Plus, it’d have a lean memory footprint because there’d be no overlapping data stored for the strings. Here’s the javadoc for the limewire’s PatriciaTrie if you’re interested.
http://www.limewire.org/nightly/modules/collection/api/org/limewire/collection/PatriciaTrie.html
I tried to solve this problem as well. I found a solution with 246 titles and then reviewed my results. Unfortunately, I missed one. I should have found 247. Back to the drawing board.