Two Guys Arguing

Mountains of Obfuscation

Posted in challenge by benjaminplee on 02.26.12

What would you do?

You have a couple hundred files across several nested folders that contain 10000+ obfuscated, random, and unique user names.  You also have a translation file listing every obfuscated name and the new name you need to replace it with.  Some files contain all 10000+ obfuscated names.  Some contain none, or just a couple.

How would you update all of the files?

Obviously this isn’t a crazy hard problem; there are many ways to solve it.  I know how I solved it.  I am curious as to how YOU would solve it.

Go!

 

[edit: added clarification]

2 Responses


  1. Heath Borders said, on 02.26.12 at 11:00 pm

    The simplest way first. Read the translation file into memory and create a map from old to new uids. Then scan all the folders looking for uids and replace them.

  2. Matteo Casalino said, on 12.06.12 at 7:54 am

    assuming the translation file to be a list of tab-separated pairs of strings:

    sed -e 's/\([^\t]*\)\t\([^\t]*\)/s\/\1\/\2\/g/g' translation_file > script
    fgrep -rlZ "$(cat translation_file | cut -f 1)" root_dir | xargs -0 sed -i -f script
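
Heath's suggestion in the first response (load the translation file into a map, then scan the folders and replace) might look something like the Python sketch below. It assumes the tab-separated translation format from Matteo's response and UTF-8 text files. The function names `load_translations`, `build_pattern`, and `replace_in_tree` are made up for illustration and do not come from either commenter.

```python
import re
from pathlib import Path


def load_translations(path):
    """Read 'old<TAB>new' pairs into a dict (the file format is an assumption)."""
    mapping = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            old, new = line.split("\t", 1)
            mapping[old] = new
    return mapping


def build_pattern(mapping):
    """Compile one alternation that matches any old name literally."""
    # Longest names first, so a name that is a prefix of another cannot shadow it.
    names = sorted(mapping, key=len, reverse=True)
    return re.compile("|".join(re.escape(name) for name in names))


def replace_in_tree(root, mapping):
    """Rewrite files under root, one pass per file; return the paths that changed."""
    if not mapping:
        return []
    pattern = build_pattern(mapping)
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # newline="" keeps the original line endings untouched.
        with open(path, encoding="utf-8", newline="") as fh:
            text = fh.read()
        new_text = pattern.sub(lambda m: mapping[m.group(0)], text)
        if new_text != text:
            with open(path, "w", encoding="utf-8", newline="") as fh:
                fh.write(new_text)
            changed.append(path)
    return changed
```

Calling `replace_in_tree("root_dir", load_translations("translation_file"))` would update the tree in place. Because each file is rewritten in a single pass, a new name that happens to equal some other old name is not translated a second time, which can happen when thousands of `s/old/new/g` commands run one after another.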

