Tuenti Challenge 8
An explaination of my solutions to the #tuentichallenge8
Introduction
First of all, a huge thank you to all the organizers.
In this post I talk about my experience and explain some parts of the code, I assume the reader is familiar with the challenges, so keep this tab open:
My code is available here:
https://github.com/Badel2/TuentiChallenge8
If you have any questions or comments, join the discussion on reddit .
And since this turned out pretty long, if you are only interested in some specific challenges use this links to jump to them:
P01 P02 P03 P04 P05 P06 P07 P08 P09 P10 P11 P12 P13 P14 P15
P01
Waffles
The logic for this challenge is pretty straight-forward, so I will explain my setup instead. I use debian as the operating system and Rust as the programming language. In all the challenges I read the input from stdin and print the output to stdout. To compile and run the code I run a command similar to this:
rustc p01.rs
./p01 < testInput > testOutput
Or just
cargo run -- < testInput > testOutput
This is the code that reads the input. It may seem quite verbose compared to other languages, and indeed it is.
use std::io;
fn main() {
let mut input = String::new();
io::stdin().read_line(&mut input).unwrap();
let num_cases: u32 = input.trim().parse().unwrap();
for i in 0..num_cases {
input.clear();
io::stdin().read_line(&mut input).unwrap();
let nm: Vec<u32> = input
.trim()
.split(" ")
.map(|s| s.parse::<u32>().unwrap())
.collect();
let n = nm[0];
let m = nm[1];
println!("Case #{}: {}", i + 1, waffles(n, m));
}
}
Notice all the calls to unwrap()
? In a real application they would be
replaced with proper error handling: reading from stdin can fail, parsing a
string as an integer can fail. Instead of throwing exceptions, Rust favors
explicit error handling.
This is the equivalent to getline
, we need to clear the input
because any new data is appended to the input string.
input.clear();
io::stdin().read_line(&mut input).unwrap();
Parsing a line with only one integer is easy, just annotate the type somewhere:
let x: u32 = input.trim().parse().unwrap();
let x = input.trim().parse::<u32>().unwrap();
But parsing multiple elements per line is a bit more complicated, we need to split the input string, parse every part, and store it into a vector.
let nm: Vec<u32> = input
.trim()
.split(" ")
.map(|s| s.parse::<u32>().unwrap())
.collect();
Then we just index this vector to get the desired input, which is pretty ugly:
let n = nm[0];
let m = nm[1];
There is an experimental slice pattern syntax (stable since Rust 1.26) which would transform that into:
let mut arr = [0; 2];
arr.copy_from_slice(&nm);
let [n, m] = arr;
Which is slightly nicer as long as you ignore the first two lines.
All this code can probably be replaced with a Python one-liner, so solving this simple challenges in Rust is just personal preference.
P02
Nazi base
I usually use cargo
to manage my Rust projects, but when a challenge can be
solved without using any external libraries, I just submit the main.rs
file.
In this challenge I had to use the
num_bigint
crate because we need to be able to calculate up to about 26^26
, and 64 bits
are not enough. However 128 bits would have been enough, and Rust has support
for 128 bit integers since version 1.26. The funny thing is that at the time of
the challenge, the stable Rust version was 1.25, so I decided not to use these
nightly features. As a result, the code is slightly more verbose.
See blog/p02_u128/ for the exact same code using u128.
The solution depends only on the input string length, so I precalculated all the possible values into a lookup table.
The algorithm is pretty simple, it just calculates the maximum value (543210) and the minimum value (102345) and returns the difference.
let mut min = 0;
let mut max = 0;
let mut p = 1;
for x in 0..base {
max += x * p;
// No leading zeros: transform 012 into 102
min += match base - x - 1 {
0 => 1,
1 => 0,
n => n,
} * p;
p *= base;
}
return max - min;
For the minimum value we need to swap 0
with 1
, so 012
becomes 102
,
because 012
is not a valid number (it cannot have leading zeros). This is
accomplished using Rust’s match, which returns a value so we can inline it like
that. The equivalent of that match in C would be:
n = base - x - 1;
min += (
(n == 0) ? 1 :
(n == 1) ? 0 :
n
) * p;
No, wait, that’s too readable, how about:
min += (n<2?1-n:n) * p;
Nice.
P03
Scales and notes.
The algorithm is pretty simple:
- Create all 24 possible scales
- Remove duplicate notes from input
- For each note: remove all the scales which don’t contain this note
And we are left with all the scales that contain all the notes.
fn all_scales_containing(n: &[Note]) -> String {
let mut a = Scale::all_scales();
// We make a copy of n because we want to modify it
let mut uniq_notes = n.to_vec();
uniq_notes.sort();
uniq_notes.dedup();
for ref n in uniq_notes {
a.retain(|ref s| s.notes.contains(n));
}
...
The code is simple thanks to the retain
function, which iterates over the
vector and removes all the elements for which the predicate returns false.
contains
is also a builtin function, which simply searches an element in a
vector. There are many useful functions which greatly simplify the code,
sometimes it’s a pain to work without them.
As for the parsing code, Note
is just an enum:
enum Note {
A,
As,
B,
C,
Cs,
...
And then we have a giant match statement to match strings to notes, merging equivalent notes to the sharp version:
match s {
"A" => A,
"A#" => As,
"Bb" => As,
"B" => B,
"B#" => C,
"Cb" => B,
"C" => C,
"C#" => Cs,
"Db" => Cs,
...
Before creating all the scales we need to create all the notes, and since Rust is a type safe language we can’t just do
for(i=0; i<12; i++) {
Note n = (Note)i;
notes.push(n);
}
There exist crates which provide that functionality, but the
workarround is just to copy the enum definition and add vec![
before it, like
this:
fn all_notes() -> Vec<Note> {
vec![
A,
As,
B,
C,
Cs,
...
Although it is valid to do the opposite conversion, for example to check if
a scale is a major scale we cast the Note
to u8
:
let major = ((12 + self.notes[2] as u8 - self.notes[1] as u8) % 12) == 2;
A simple test to verify that the code works is to print all the scales, which can be accomplished with this one liner:
println!("{:#?}", Scale::all_scales().iter().map(|s| (s.name(), s.notes.clone())).collect::<Vec<_>>());
Which gives us a nicely formatted output:
[
(
"MA",
[
A,
B,
Cs,
D,
E,
Fs,
Gs
]
),
(
"MA#",
[
As,
C,
D,
Ds,
F,
G,
A
]
),
...
P04
Knight Princess Exit
This challenge consists of finding the shortest path from the knight to the princess and then to the exit. I don’t know about you, but when I hear the shortest path I immediately think Dijkstra. So yeah, if we can transform the map into a graph we’re done. There are two paths to calculate but we can treat them as two independent problems since in the worst case we will have to calculate everything twice, which isn’t a big deal.
I opted for generating the edges on demand. We store the map as a
Vec<Material>
, we index it using a 2D Pos { x, y }
, using the simple
x + y * width
formula to get the element from the 1D vector. The important
function is get_edges
which returns all the possible jumps that the knight
can make, depending on the material of the current square: ground, trampoline
or lava (there are no possible jumps when on lava, symbolizing the inevitable
death). Since there are only 8 possible jumps I just quickly copy-pasted them
all:
pos.push(Pos { x: p.x + 2, y: p.y + 1, });
pos.push(Pos { x: p.x + 2, y: p.y - 1, });
pos.push(Pos { x: p.x - 2, y: p.y + 1, });
pos.push(Pos { x: p.x - 2, y: p.y - 1, });
pos.push(Pos { x: p.x + 1, y: p.y + 2, });
pos.push(Pos { x: p.x + 1, y: p.y - 2, });
pos.push(Pos { x: p.x - 1, y: p.y + 2, });
pos.push(Pos { x: p.x - 1, y: p.y - 2, });
pos.retain(|&p| p.x < self.size.0 && p.y < self.size.1);
Finally I had the idea to remove jumps which leave the knight outside the
board, but that optimization is not really needed since we return “no edges”
when called with a position outside the board anyway. However, since the
position is stored as an unsigned integer (u32), when we substract from it we
can get a negative number, which will overflow. In this case that won’t be a
problem because -1
is 2^32 - 1
which is well outside the board anyway, but
Rust has overflow checks enabled by default in debug mode, so it will panic at
runtime if there is an overflow. One solution would be to explicitly state that
we want to use wrapping arithmetic operations:
pos.push(Pos { x: p.x.wrapping_add(2), y: p.y.wrapping_sub(1), });
But I didn’t want to rewrite that 16 lines of code, so I just compiled in release mode, which disables overflow checks.
As for the Dijkstra algorithm, I just copied it from
here
with a small change to use a HashMap
instead of a Vec
for keeping track
of the minimum cost of each node.
P05
DNA slices
In this challenge we need to connect to a server. This is the base code which starts a TCP connection and enters a read loop:
use std::io::{Read, Write};
use std::net::TcpStream;
fn main() {
let mut stream = TcpStream::connect(("52.49.91.111", 3241)).unwrap();
let mut buffer = vec![0u8; 1024];
//let state = b"TEST\n";
let state = b"SUBMIT\n";
'tcp: loop {
match stream.read(&mut buffer) {
Ok(0) => break 'tcp,
Ok(n) => {
let buf = &buffer[0..n];
let s = String::from_utf8_lossy(&buf);
println!("{}", s);
if let Some(_) = find_subsequence(&buf, br#"> Please, provide "TEST" or "SUBMIT""#)
{
stream.write(state).unwrap();
} else if buf[0] != b'>' {
// DATA
//[...]
}
}
Err(e) => panic!("Error reading stream: {:?}", e),
}
}
}
The buffer
is just a 1024-byte array, when we read from the stream we don’t usually
fill the buffer so we alias the first n
bytes:
let buf = &buffer[0..n];
Both buffer
and buf
work on bytes, not on chars. The first this I do is convert
the buf
into a String
, and print it. This is important because we want to save
the token when we get it!
We can use the find_subsequence
helper function to search for some bytes in
buf. We use that to write “TEST” or “SUBMIT” when asked to.
We work on Vec<u8>
instead of String
because Rust has proper unicode
support so we can’t index strings, and even something as simple as a strlen
becomes O(n), because number of bytes !=
number of chars. This is why the
literals begin with a b
: b"TEST\n"
is a byte array, not a string.
I assumed that the lines which do not begin by >
are DNA parts.
We parse them using the usual trim().split(" ")
combo but this time we also
add .enumerate()
, to keep track of the index of each part, since we need that
for the result:
// DATA
let parts: Vec<_> = s.trim()
.split(" ")
.enumerate()
.map(|(i, s)| Part::new(i + 1, s.as_bytes().to_vec()))
.collect();
println!("PARTS: {:#?}", parts);
let idx = find_original_samples(&parts);
println!("Result: {}", idx);
stream.write(idx.as_bytes()).unwrap();
Notice that here we get the parts from a String
, but convert to Vec<u8>
using s.as_bytes().to_vec()
. Similarly, the result of find_original_samples
is also a String
, and we must convert it to &[u8]
to send it through the
stream. When I was a beginner stuff like this was a pain. Now it still is a
pain but I’m used to it.
It took me a while to understand the question: there is an original sample, which is repeated twice:
aaccGgTTC aaccGgTTC
Then this two samples are randomly split into parts:
aacc GgTTC aaccG gTTC
Then some random parts are added to the mix, and the whole thing is shuffled:
aacc GgTTC aaccG gTTC aaaa cccc gttc
Now our task is to find which parts are from the original sample. Since the two original samples are randomly split we can’t just compare the parts, however we know that the beginning and end of each sample will be intact, so we can compare that. For example, we take the first part and check if any other part shares a common start with it:
aacc
GgTTC aaccG gTTC aaaa cccc gttc
Then, we take the remaining letters (in this case the last G
from aaccG
),
and repeat the process until no parts match:
G
GgTTC gTTC aaaa cccc gttc
gTTC
gTTC aaaa cccc gttc
aaaa
cccc gttc
Then, the original parts are the ones that we removed from the list.
Now back to code:
There is a helper function which checks if two parts share a common start,
and returns the remaining letters, or None
if they don’t match:
fn share_common_start(a: &[u8], b: &[u8]) -> Option<Vec<u8>> {
// Make sure a is the longest string:
let (a, b) = if a.len() > b.len() { (a, b) } else { (b, a) };
if a[0..b.len()] == *b {
Some(a[b.len()..a.len()].to_vec())
} else {
None
}
}
For example:
a: TTCGGTCT
b: TTCG
returns Some(GTCT)
The main loop calls find_original_samples
which takes care of the output
format, but the good stuff is in the find_samples
function.
fn find_samples(start: &[u8], p: &[Part]) -> Option<Vec<usize>>
It takes the start substring, a list of parts, and returns the indexes of the
parts which are from the original sample. When called with an empty start
, it
will iterate over all the parts trying each of them as a start. If it finds a
match, it will recursively find all the possible “original sample parts”. Here
is the code for the case where the start is not empty:
for x in p {
if let Some(new_start) = share_common_start(&x.data, start) {
let mut rem_p = p.to_vec();
// Remove x from the remaining parts
rem_p.retain(|t| t != x);
if let Some(indexes) = find_samples(&new_start, &rem_p) {
let mut ix = indexes;
// Add x to the original sample index list
ix.push(x.idx);
return Some(ix);
}
}
}
return None;
It’s a nice recursive algorithm. I got the idea from a similar problem, the merged string puzzle from the datagenetics blog , which is a really nice blog and you should definitely check it out.
P06
Button Hero
This was the first problem where I had to rewrite my program after passing the test case, because I couldn’t even solve the first submit case.
To begin, it is important to realize that we don’t care about the position and speed of the notes, we only need the time when we must press and release the button. The transformation is simple:
let t0 = x / s;
let t1 = (x + l) / s;
We will store these values in a struct called Interval
:
struct Interval {
t0: u32,
t1: u32,
}
My original algorithm consisted of a divide and conquer approach where I split the map into non-overlapping clusters and solve each cluster independently. By solving each cluster I mean trying all the possible combinations: press one note (and remove all the notes which overlap with it), this will create new clusters which will be solved recursively, until there are no more notes remaining. In the end we just return the maximum score.
The new and working algorithm builds a graph with the time intervals. For each interval, its children are the intervals which can be pressed after that one, with the restriction that if we can press a note for free, we must take it. Consider the following example:
.AAA............
......BBBBBBBB..
.......CCC......
...........DDD..
^ ^
B.t0 B.t1
Here if we take as A’s children all the notes with t0 < B.t1
, we will have 3
paths: AB, AC, AD. However the AD path would be wasteful because we can take
the ACD path. So we don’t consider D to be A’s child. We will see the code
later.
First we build a Map
, which just stores all the intervals ordered by t0
.
struct Map {
score: u32,
intervals: BTreeMap<Interval, u32>,
}
We use a BTreeMap
(ordered_set) because it provides the range()
function,
so we can easily iterate over Interval
s based on restrictions on t0
. The
score field used to store the extra score gained for pressing trivial notes,
the ones which do not overlap with anything else. But we don’t use that
optimization in the graph approach. Thanks to the entry API we can build the
BTreeMap
while merging all the identical intervals, two notes A and B with
the same t0
and t1
will be merged as a note N with score A+B:
let mut intervals = BTreeMap::new();
for n in notes {
let a = Interval::from_note(&n);
*intervals.entry(a).or_insert(0u32) += n.p;
}
Usually working with graphs in Rust is a pain because of the ownership rules. Luckly, in this case the graph doesn’t have any cycles, so it’s not that bad. Each node can have multiple children, which isn’t a problem, but each child can have multiple parents. With multiple parents we need reference counting to make sure that there is no node with a reference to this child when deallocating it. This is one of the main selling points of Rust, to ensure memory safety and forget about invalid pointers! Aside from reference counting, there is also the arena approach where an arena owns all the nodes, so they are all deallocated at once, and the nodes are accessed using indexes to the arena instead of pointers. But I wanted to try the reference counting approach, so our nodes will look like this:
struct Node {
t: Interval,
score: u32,
children: Vec<Rc<Node>>,
}
Rc<Node>
is a reference-counting pointer to Node
, just like a
std::shared_ptr
in C++.
To build our graph we need to convert the Interval
s into Rc<Node>
, but once
we insert the Rc<Node>
into the graph we cannot mutate it. That’s another
selling point of Rust, if a resource is shared, it can not be mutated. This
means that we can only add children before adding a node to the graph. I guess
it should be possible to create the graph in reverse order, starting at t0=max
and ending at t0=0, but I wanted to start at t0=0. So, when Rust doesn’t allow
you to mutate something, you just throw a RefCell
at it:
struct Node {
t: Interval,
score: u32,
children: RefCell<Vec<Rc<Node>>>,
}
A RefCell
is basically a way to enforce the “no mutability when shared” rule
at runtime. It will only let you mutate the inner value if nobody else is using
it.
Let’s start building the graph. First we convert the intervals into Rc<Node>
.
// self.intervals: BTreeMap<Interval, u32>
let rcn: HashMap<_, _> = self.intervals
.iter()
.map(|(i, s)| (i, Rc::new(Node::new(*i, *s))))
.collect();
For some reason I decided to store the Rc<Node>
s in a HashMap<Interval,
Rc<Node>>
. This way we can work with self.intervals
, and index rcn
to
get the corresponding Rc<Node>
for each Interval
.
Next we create the root node, which is just an interval with t0=0. Then, we take the interval with the lowest t0 (the first note to be pressed) and try to find the minimum t1 over all the intersecting notes.
let mut root = Node::root();
let (first, _first_s) = self.intervals.iter().next().unwrap();
let mut true_limit = first.t1;
for (x, _s) in self.intervals.range(first..) {
true_limit = min(true_limit, x.t1);
if x.t0 > true_limit {
break;
}
}
That’s the optimization I was talking about before, here the true_limit
for A would be C.t1, and A’s children would be B and C:
.AAA............
......BBBBBBBB..
.......CCC......
...........DDD..
^
C.t1
This is an optimization which I accidentally didn’t implement, because in my
code I wrote if x.t1 > true_limit
(facepalm). But I thought it would be
necessary, and the difference in runtime is pretty noticeable: 1.429s vs
0.517s on submitInput.
Once we have the true_limit
, we want to add all these nodes as children:
let mut children = vec![];
let limit_up = Interval {
t0: true_limit,
t1: u32::MAX,
};
for (x, _s) in self.intervals.range(first..&limit_up) {
children.push(Rc::clone(&rcn[x]));
}
root.ch = RefCell::new(children);
root
The function returns the root node. Once we have the graph we just need one simple function:
fn score_plus_max_of_children(&self) -> u32 {
let ch = self.ch.borrow();
let mut mc = 0;
for n in ch.iter() {
let a = n.score_plus_max_of_children();
mc = max(mc, a);
}
self.s + mc
}
This function has a pretty descriptive name, it returns the score of the node plus the score of its best child, plus the score of the best child of its best child, keep going until a node has no children. When called on the root node, it will return the maximum score for this graph. However we are missing something very important, to avoid calculating the score of a node multiple times we need to cache the result! Without a cache we can’t even calculate the first case from submitInput.
Recursion is key, but recursion + cache is god.
– Me after completing P10
struct Node {
t: Interval,
s: u32,
ch: RefCell<Vec<Rc<Node>>>,
max_of_children: RefCell<Option<u32>>,
}
And again, since a node can be shared by multiple parents, we need a RefCell
to introduce mutable state. Well, in this case I should have used a Cell
but
I was too lazy to add use std::cell::Cell;
.
// Using RefCell
fn score_plus_max_of_children(&self) -> u32 {
// Use cached result when possible
if let Some(s) = &*self.max_of_children.borrow() {
return self.s + s;
}
let ch = self.ch.borrow();
let mut mc = 0;
for n in ch.iter() {
let a = n.score_plus_max_of_children();
mc = max(mc, a);
}
*self.max_of_children.borrow_mut() = Some(mc);
self.s + mc
}
And with this small change our code completes the submitInput in less than a second! Recursion is truly magic.
RefCell vs Cell
But going back to RefCell
vs Cell
:
The difference is that RefCell
has a reference count and it won’t let you modify the value if it is already
borrowed by someone else, while Cell
just gives you a copy of its value, so
it has zero overhead. But
Wait, it looks like Cell
cannot be used with vectors, or anything
that involves allocation because it needs a Copy
type.Cell
can be used on any type now, (although we will lose
the ability to Debug
and Clone
nodes) so let’s change our code to use Cell
instead and see how big is that overhead:
RefCell: 0.517s, Cell: 0.476s
Well, it’s something. The main difference is that when using RefCell
the
children vector always holds a valid value, but with the Cell
we literally
take the vector when we need it, leaving an empty vector in its place, and we
need to put it back later:
// Using Cell
fn score_plus_max_of_children(&self) -> u32 {
// Use cached result when possible
if let Some(s) = self.max_of_children.get() {
return self.s + s;
}
let ch = self.ch.take();
// self.ch is now an empty vector
let mut mc = 0;
for n in ch.iter() {
let a = n.score_plus_max_of_children();
mc = max(mc, a);
}
self.ch.set(ch);
// self.ch is set to its original value
self.max_of_children.set(Some(mc));
self.s + mc
}
Notice that the “empty value” problem doesn’t exist with max_of_children
,
because we can safely copy that type (Option<u32>
) so when we need to read
the value we just call get()
, which returns a copy.
You can check out the Cell
version, where I also fixed the x.t1 >
true_limit
bug, here:
blog/p06_fixed
P07
GameBoy ROM
I had already downloaded a GameBoy debugger and began dissassembling the ROM. I was looking for some strange function with a decryption mechanism that would be activated with a certain button combination. But then I realized that the “encrypted” string has a “=” at the end. I don’t want to spoil the solution, look at the code if you are curious. But how did I know that the unknown file was a ROM, and how did I extract the string? With these useful commands:
$ file unknown
unknown: Game Boy ROM image: "UNKNOWN" (Rev.01) [ROM ONLY], ROM: 256Kbit
$ xxd unknown | less
$ strings unknown -n 100 > testInput
P08
Doors
We sum the door index to its offset so the problem becomes finding the instant when all the doors are open at once.
We can easily merge two doors: if door A opens every 3 seconds and door B opens every 5 seconds, then they are equivalent to a door C which opens every 15 (3*5) seconds. And if we can merge two doors we can merge all the doors, so the solution for this challenge will just be the offset at which that super-door opens!
But how to calculate the offset of the merged door? I had no clue, so I just
bruteforced it. It worked for the test case, but it was pretty slow for the
submit case. It took exactly 30 minutes to calculate case 100, and I was
missing 50 cases so I calculated the expected runtime to be about 20 hours, or 5
hours if I implement multithreading split the input into 4 parts and run
the program multiple times at once.
So I left one thread running while I think about a better solution.
Let’s start with a few simple cases. We will add an offset to both doors so that door A opens at t=0. Then, if door B also opens at t=0, door C (A+B) will open at t=0. If instead, door B opens at t=1, then both doors will coincide at t=6 (0+3+3, 1+5). Let’s see all the cases:
0123456789012345 A..A..A..A..A..A..A..A..A..A..A B....B....B....B....B....B....B .B....B....B....B....B....B.... ..B....B....B....B....B....B... ...B....B....B....B....B....B.. ....B....B....B....B....B....B. 0, 6, 12, 3, 9
The doors coincide at t: 0, 6, 12, 3, 9. Notice any pattern? It’s 0+6+6-9+6.
Most of the terms just add 6 to the previous one, expect there is a jump from
12 to 3. 12+6 = 18, and both doors are also open at t=18. But since we only
look at the first time both doors are open, and everything repeats every 15
seconds, we must substract 15, or even better: use the modulus operator: 18 %
15 == 3
.
So there is an unknown number, which we will call tt
. The formula for the
time when both doors are open is:
t = a.offset + tt * b.offset
For A=3 and B=5, tt=6. Now, what is 6? 6 is 3*2, and also 5+1. I repeated the experiment with other values, using a sheet of paper, and got the following table:
A B tt
3 5 6
5 3 10
4 5 16
5 4 5
3 7 15
7 3 7
We can see a pattern: tt = A*n
and tt = B*m + 1
. In other words,
tt % A == 0
and tt % B == 1
.
That’s basically the definition of modular multiplicative inverse :
The modular multiplicative inverse (modulo M)
of an integer a
is an integer x
such that:
ax = 1 (mod M)
If we combine that definition with our equations:
a*x % M == 1
tt % B == 1
tt % A == 0
A=a
B=M
tt=a*x
Therefore, we just need to find x: the modular multiplicative inverse of A modulo B,
and tt is just tt = A * x
.
Here I didn’t show any Rust code because it doesn’t look exactly nice. Since I
had to use BigInts again, and this time the task was not trivial, the code
became pretty messy pretty fast. For example, we must add an ampersand (&
)
when doing any operation with BigInts because otherwise the BigInt is literally
moved away and doesn’t exist under the old name anymore:
let mut t = &a.p - &a.t;
let mut steps = &p / &a.p;
Also, I forgot to consider the case where two doors share a common factor: A=6
B=10 repeat every 30 seconds, which I got right, but the value of tt
is not
the one we get using the modular inverse formula. Luckly I could just fallback
to the old bruteforce algorithm in that case, and it worked better than I
expected.
I have to admit that at some point I thought about just switching to Python, because this challenge can be solved applying a few mathematical formulas so there is little benefit in using Rust.
P09
Unscramble a photo.
This was fun, because I had an algorithm in mind but it didn’t work at all, it was a complete failure. My idea was to sort the columns to minimize the difference between one column and its surroundings. This difference was calculated as the sum of the differences in color between one pixel (x=0, y=0) and the corresponding pixel in the next column (x=0, y=1):
fn similarity(&self, a: &Vline) -> f64 {
let mut x = 0.0;
for i in 0..self.pixels.len() {
x += pixel_diff(&self.pixels[i], &a.pixels[i]);
}
x
}
fn pixel_diff(a: &Rgb<u8>, b: &Rgb<u8>) -> f64 {
let mut x = 0.0;
for (ca, cb) in a.channels().iter().zip(b.channels()) {
let ca = *ca as f64;
let cb = *cb as f64;
x += f64::abs(ca - cb);
}
x
}
But as I said, it didn’t work as expected, the image was indistinguishable from the scrambled one. But don’t worry, I always have more ideas. The next idea was to sort the image using diagonal lines as guides. In the test image there are a few obvious ones just below the QR Code. This is the result of sorting by of one of these lines (I forgot which one):
Also, check out this short animation where I try to find an optimal threshold:
As you can see, this almost works. At that point I could have tried to improve my algorithm, perhaphs looking for multiple guide-lines, or doing some color-based clusterization, but since I could almost see the QR code and I heard that these codes have nice error correction, I tried to scan it. It didn’t work. But I thought that I can manually reconstruct the missing bits, since the image is almost sorted anyway, and I have seen people reconstructing a QR code from less than that . So I decided to make a tool to help me sort the image line by line.
I used SDL2 since I already have some experience with it, and the Rust bindings are pretty nice.
The tool consists of a window with the image on the background and a black vertical line which represents the current selection. That black line is only aligned with the image when the image width is equal to the window width. I would post a screenshot, but there’s nothing to show, really. All the commands are performed using the keyboard:
A
and D
move the selection left or right.
W
and S
increase or decrease the movement speed.
H
and L
move the selected column left or right.
Space
toggles show/hide selection.
F
toggles faster movement.
R
rotates the columns of the image.
Q
sorts the image by diagonal lines, pressing it multiple times changes the y
offset where it looks for lines.
And very important: at any step that modifies the image, we save it and load it again. That’s just because I was too lazy to save the image in RAM. But as a side effect now we have all these cool videos! :D
Check out this album, it shows what happens when we press Q
multiple times:
On the submit image we start with almost half of the QR code!
I have some videos of the manual reconstruction process, the submit one is the longest with 1:06 minutes and only 10.3 MiB, even though it is made of 3962 PNG images with a total size of 18.8 GiB, video compression is truly amazing.
You can watch them on YouTube, because Imgur doesn’t allow videos longer than 30 seconds. I recommend fullscreen and the highest resolution:
In the submit image I was missing two corners in the end, so I just painted them using GIMP:
Total time spent was 1 hour for the test case and 3 hours for the submit case.
P10
Dance
First let’s consider the case with no grudges, how many ways are there to make pairs such that no lines cross? Obviously we can’t make pairs if the number of people is odd.
P N
2 1
4 2
6 5
8 ?
P=8 is not trivial, so enjoy my ascii drawing skills: The top left pixel is 0 and we number the people clockwise.
* * *-* * * * * /* *
* * * * * \ * *| * * *
* * * * * -* *| * * *
* * * * * * * * * *
We can split the P=8 case into 4: if we pair 0 and 1, we are left with 6 people and we already know that there are 5 ways to pair them. Next we pair 0 and 3, which leaves 2 people on the right and 4 people on the left. We also know how to solve these cases. The last two partitions are just reflections of the first two.
N(8) = N(6) + N(2) * N(4) + N(4) * N(2) + N(6)
N(8) = 5 + 1*2 + 2*1 + 5 = 14
Now that we have a few numbers, let’s google the sequence to see if it matches
something: 1, 2, 5, 14
results in https://oeis.org/A007051, and N(10)
should be 41. Let’s try to follow the same algorithm, but this time I won’t
draw it:
N(10) = N(8) + N(2) * N(6) + N(4) * N(4) + N(6) * N(2) + N(8)
N(10) = 14 + 1*5 + 2*2 + 5*1 + 14 = 42
Classic off-by-one, except it is not. If we now google the sequence
1, 2, 5, 14, 42
, the first link is the wikipedia page for
Catalan numbers
. These numbers seem to be very important in combinatronics, so this time
N(12) must be 132.
But we already have a nice algorithm, so let’s implement it. We can easily add grudges: before splitting the circle into two just check that the two people we chose can form a pair.
struct Dance {
p: u32,
g: HashMap<u32, Vec<u32>>,
}
p
is the number of people, and g
is the list of grudges.
The grudges are offset-based: a grudge between 4 and 3 will be represented as
g[3].push(4-3);
, so we can access all the grudges given the lowest index.
I used a HashMap
instead of a Vec
because I expected the grudges to be
sparse: not all people should have grudges.
This algorithm can be easily implemented in a nice recursive way: split the dance into two until there’s no more people left.
fn calc(&self) -> u32 {
if self.p % 2 != 0 {
return 0;
}
if self.p == 0 {
return 1;
}
// recursion!
let mut acc: u64 = 0;
let mut i = 1;
while i < self.p {
// Num combinations = product of combinations of each part
if let Some((dl, dr)) = self.split_into_two(0, i) {
acc += (dl.calc() as u64) * (dr.calc() as u64);
acc = acc % 1_000_000_007;
}
i += 2;
}
acc as u32
}
The split_into_two
function returns None
when there is a grudge between 0
and i, and otherwise returns the two new Dance
s as Some((dl, dr))
. These
new dances can then be recursively solved.
fn split_into_two(&self, i: u32, j: u32) -> Option<(Dance, Dance)> {
if i != 0 {
unimplemented!()
}
// check if 0 and j form a grudge
if self.g.get(0).map_or(false, |x| x.contains(&j)) {
return None;
}
let drp = j - i - 1;
let mut drg = HashMap::new();
// Change the offset of the grudges, since we remove the 0
// Move g[x] to g[x-1]
if drp > 0 {
for x in i + 1..j {
if let Some(grs) = self.g.get(&x) {
for &gr in grs {
if gr < j - x {
drg.entry(x - 1).or_insert(vec![]).push(gr);
}
}
}
}
}
let dr = Dance::new(drp, drg);
// Same for dl
// Move g[x] to g[x-j-1]
// [...]
Some((dl, dr))
And while this algorithm works pretty well for the test case, we can’t use it for the submit case because the recursion gets slower and slower with more people. So while thinking about a solution, I added some optimizations.
For example, we can add a list of the Catalan numbers and use it when there are no grudges, this should save a few levels of recursion:
// fn calc(&self) -> u32
let n = self.p as usize / 2;
const CATALAN: &[u32] = &[1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796];
if n < CATALAN.len() && self.g.is_empty {
return CATALAN[n];
}
Also, we can avoid checking the list of grudges in split_into_two
and move
that check to calc
, which will be much more efficient if we also sort the
vector of grudges. But these optimizations are not enough, we need some way to
decrease the depth of the recursion, avoid calculating similar things more than
once, we need cache!
struct Cache {
c: HashMap<Dance, u32>,
}
I created a dedicated cache struct to be able to control its behavior, we don’t want to run out of memory, which will happen if we keep all the entries. I was planning to remove the “cheapest to calculate” (few people) first, but it turned out to be unnecessary.
But we can’t use the Dance
as a key, because it’s not hasheable. It contains
a HashMap
of grudges, and that HashMap
cannot be hashed (since its elements
are not ordered, two hashmaps with the same elements would return a different
hash, unless we sort the elements before hashing but that’s very inefficient).
Luckly the BTreeMap
can be hashed, and it provides a very similar interface
so we can replace one with the other without many changes to the code:
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Dance {
p: u32,
g: BTreeMap<u32, Vec<u32>>,
}
Now add a few lines to the calc
function:
fn calc(&self, cache: &mut Cache) -> u32 {
// The cache is magic, it converts O(n!) into O(n^2)
if let Some(x) = cache.get(self) {
return x;
}
// [...]
let acc = acc as u32;
let clon = self.clone();
cache.insert(clon, acc);
acc
}
And the code is able to complete the submitInput! The runtime is pretty bad, about 3 minutes, so there must be room for optimization. I don’t know if the algorithm is actually O(n^2), that’s just something I read on stackoverflow while searching for optimizations.
If I had to rewrite the problem I would store the grudges and the cache in one
struct, and make all the recursive partitions on other struct, this way we
avoid cloning and modifying the list of grudges when splitting, and also we can
use a HashMap
to store the grudges, which is faster than a BTreeMap
. Well,
let’s just do that:
struct Dance {
g: HashMap<u32, Vec<u32>>,
cache: HashMap<DancePart, u32>,
}
impl Dance {
fn calc(&mut self, p: u32) -> u32 {
let x = DancePart { offset: 0, p: p };
x.calc(&mut self)
}
}
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
struct DancePart {
offset: u32,
p: u32,
}
impl DancePart {
fn calc(&self, dance: &mut Dance) -> u32 {
}
}
You can check out that alternative implementation: blog/p10_hashmap , and the run time is now only 1.2 seconds, a 150x speedup!
I have also tried a version which uses a Vec<Vec<u32>>
instead of a HashMap
to store the grudges, but the difference was very small, it was about 0.02s
faster.
P11
Diamonds
This was the last challenge I had time to begin, but I got stuck and wasn’t able to complete it.
The goal is to maximize the number of lasers with the restriction that two
lasers cannot cross on a diamond. A good starting point would be to set lasers
on all the rows, and leave the columns empty. Then, check if we can add lasers
to the columns without crossing any diamonds. So in the worst case, the
solution would be max(num_rows, num_cols)
. But the best solution won’t always
have a side full of lasers, sometimes we need to remove one laser to be able to
add two, like in this example:
ABCD
vv
1> **
2 **
3> *
4> *
We remove the laser on row 2, and add two lasers on columns C and D. I guess we could check all the possibilities: remove one laser, try to find a stable configuration, repeat with another laser. But the limits are N,M <= 500, so a naive approach won’t work.
Some trivial optimizations are removing empty rows and columns, which will always have a laser, and splitting the problem into clusters when some groups of diamonds are independent. In the example above, we can split the grid into two:
**
*
*
**
And then it becomes clear that the solution is 3+2 = 5. It would be nice if all problems could be easily solved using recursion, but this doesn’t look like it.
So let’s talk about the implementation. I decided to store the grid as a vector
of vectors of indices: Vec<Vec<u32>>
, instead of storing it as a boolean
matrix like a normal person.
**
*
*
In this example, the vector would look like this:
[
[0, 1],
[1],
[0]
]
And there are a few useful helper functions:
traspose
, which transposes the matrix (switch rows and columns).remove_empty_rows
removes the rows with no diamonds, to remove the columns instead we just need to transpose and remove rows.roc
sorts the matrix so that the clusters can be trivially found later, using the Rank Order Clustering algorithm, which is actually used to group machines by products in factories, but hey it works here too.find_clusters
takes the sorted matrix and divides it by the non overlapping clusters of diamonds, which can be solved independently.
Here’s an example of the output of the roc algorithm, can you see the 3 clusters?
**
**
* *
*
*
*
This simple clustering technique is almost enough when combined with
max(num_rows, num_cols)
as the score, it only fails 8 cases from the 50 in
testInput. So I decided to analyze these cases, and I realized something very
important:
Not all diamonds need a laser!
My algorithm would probably be the correct one if we need to secure all the diamonds, but it is never said that we need a laser on all the diamonds. For example:
ABCD
vvv
1 ****
2> *
3> *
4> *
Here if we leave the A1 diamond unprotected, we are able to put 6 lasers instead of the trivial 4.
And after taking that into account, I got stuck. The only clear idea I had was bruteforce, but I thought that there must be some clever way. Maybe turn the grid into a list of constrains: a diamond at A1 means that you either put a laser on row 1, on column A, or nowhere. Maybe it would be enough to find the largest rectangles with no diamonds inside the grid? Is there some way to solve this using recursion?
It was the last day of the challenge but I still had about 9 hours to work on this problem, and I managed to bruteforce the missing cases from testInput so I could see the submitInput, and the clusters were about 80x80, so better forget about bruteforce. I didn’t want to skip this challenge, because the next one had been skipped by many people, so it was probably one of these fun challenges like the one with the gameboy cartridge, where you don’t even know where to begin. And I felt like I got pretty far in this challenge, and I was only missing one small piece.
But in the end nothing happened.
P12
usb-ip
// TODO It turns out that writing is pretty time consuming, so come back in a week or so. I also plan on solving some more challenges, at least P12 which looks pretty fun, but the later ones will probably stay unsolved for a while.
Last updated 2018-05-25