Oops x1.6 million

Recently I accidentally sent out 1.6 million emails to customers. I was doing local development that integrated with a third party and didn’t realize my iterative testing was actually firing off emails in production.

The vendor doesn’t have a staging/sandbox environment so extra care is always needed when doing development.

It was a pretty embarrassing situation and I tried to take a few things away from it:

1. If you are integrating with a vendor and they only have production, find a way to mock their environment.
2. Always create safe data structures for testing. Create duplicates with the third party that mimic real data structures, then swap to live objects when going live.
3. Be careful! There should be a weighty, important feeling to doing work with integration. If you’re feeling very casual and loose, then something is wrong.
4. Put in safeguards. In this case I made a whitelist of objects that could be operated on remotely. I also put in place several business rules that would prevent mistakes in the future.
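
A whitelist guard along those lines might look something like this minimal Python sketch. Every name in it (SAFE_OBJECT_IDS, FakeVendorClient, send_campaign) is made up for illustration and is not the vendor's real API. The fake client also doubles as a crude mock for a vendor that only has production.

```python
class NotWhitelistedError(Exception):
    """Raised when code tries to touch a remote object that is not approved."""


# Hypothetical IDs of test-only objects created on the vendor's side.
SAFE_OBJECT_IDS = frozenset({"test-list-001", "test-list-002"})


class FakeVendorClient:
    """Stand-in for a vendor SDK; it records calls instead of sending email."""

    def __init__(self):
        self.sent = []

    def send_campaign(self, list_id, subject):
        self.sent.append((list_id, subject))


def send_campaign_safely(client, list_id, subject, allowed_ids=SAFE_OBJECT_IDS):
    """Refuse to call the vendor unless list_id is explicitly whitelisted."""
    if list_id not in allowed_ids:
        raise NotWhitelistedError(f"{list_id!r} is not on the whitelist")
    client.send_campaign(list_id, subject)
```

The point of the design is that the check sits in front of every remote call, so a casual local test run fails loudly instead of reaching live customers.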

Big mistakes mean opportunities to learn big.

Responsibility and Authority

At my last job I was the default MongoDB administrator. I lived in the Mongo shell all day for almost a solid year and I definitely used it more than any other developer, so that meant all things Mongo came to my inbox. Someone needs a document (or series of documents) updated? Send it to Ryan. Someone needs a new collection, needs data migrated from one server to another, needs to test connectivity, needs data exported? Send it to Ryan.

As our platform became more stable and more websites were deployed on top of it, Mongo began to wince a bit with the added activity. We started having odd issues with our production replica set and of course part of the investigation landed in my lap.

The point of this post is that even though the responsibility for Mongo was obviously mine, the authority to do real work was not. I could work with production data, but not SSH to the production machines.

How was I supposed to successfully see what might be causing issues in production if I couldn’t get on the box and look at the logs? How could I watch a process or look at the network activity? I couldn’t. I wasn’t given the authority to connect and do my work.

I’ve seen this happen several times in my career. I end up responsible for things, but have no power to do what I need to do in order to fix problems.

I’m happy to say that I’m getting better at identifying situations where I’m stuck being responsible without any authority to effect change.

Azure Fatigue

The battle rages on with Microsoft Azure. Having come from a solid year of working with Amazon’s EC2, I find working with Azure to be very frustrating at times.

A recent example is how the IP addresses – both public and private – change on instances.

For example, I have a cloud service that is up and running with a production instance and a staging instance. The private IP addresses end with .11 and .12.

I deploy my project to the staging instance and then swap staging and production once I have verified that staging is good.

What is the state of things now?

Now I have private IP addresses that end in .10 and .12. The address that ended in .11 is gone, and the instance with the new .10 address is not allowed to communicate with other machines because its address is unrecognized.

I remember going through similar pains with EC2 (getting machines to communicate internally), but at least EC2 kept things consistent. The IPs would stick on a box that wasn’t stopped and restarted. If I just rebooted a machine, the IP address would stay the same and I didn’t have this tight deployment integration that does “magic” for me.

Linqpad Azure Storage Driver

Background

At work we’re invested heavily in Microsoft Azure. We have stayed very true to the entire Azure stack and as a result we’re using their table storage to store logs.

The big idea of course is that whenever something happens in our code, we just write it to a table in Azure.

Problem

I’m sure after a year or so I might like Azure’s table storage, but for now I hate it. Even as I’m writing this, I can’t believe there are pretty much _no_ tools to query or browse the contents of a given table.

There is a project called Azure Storage Explorer, but it stinks. It’s incredibly slow and isn’t the most intuitive UI.

There is also a Linqpad driver, but the functionality is minimal.

For me, the root problem is that Azure table storage – when queried – will only return one page of data at a time. Each page is 1,000 records. From my experience you can’t even be guaranteed that the first 1,000 you get is the most recent 1,000.

All I want to do is see all the data stored!
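
Getting everything means asking for the next page again and again, using the continuation token that comes back with each page, until no token is returned. Here is a minimal Python sketch of that loop. It uses a fake in-memory fetch_page in place of a real storage call, so the function, its arguments, and the token format are all assumptions rather than the Azure SDK's API or the driver's actual code.

```python
PAGE_SIZE = 1000  # the per-page cap described above


def fetch_page(rows, token=None, page_size=PAGE_SIZE):
    """Hypothetical stand-in for one table query.

    Returns (page, next_token); next_token is None once the data is exhausted.
    """
    start = token or 0
    page = rows[start:start + page_size]
    end = start + len(page)
    next_token = end if end < len(rows) else None
    return page, next_token


def fetch_all(rows):
    """Keep requesting pages until no continuation token comes back."""
    results = []
    token = None
    while True:
        page, token = fetch_page(rows, token)
        results.extend(page)
        if token is None:
            return results
```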

Solution

The guy who wrote the Linqpad Azure storage driver put it up on GitHub, so in true open source spirit, I forked it!

In a little less than an hour I had modified the driver, tested it in Linqpad and updated the code on GitHub.

I’m so, so, so excited now that I can get all log records in Linqpad.

https://github.com/ryan1234/AzureStorageDriver

Project Euler!

Well it appears I am quite late to the party. The other day while trying to solve a CodeChef problem I stumbled across Project Euler. It was right up my alley as I’ve been trying to get into sites that help me practice algorithms and math.

Briefly, some notes after my first 37 problems:

1. Project Euler encourages me to write really bad code. Since there are no time or memory constraints (just give us the right answer), I end up writing the worst possible code to brute force answers.
2. I love Python. What else is there to say? There is a reason Python is one of the most popular languages for solving Project Euler problems. String parsing, math functions, the terseness of the language, the ability to easily handle large numbers … it’s just fantastic.
3. I am about done with C++ for problems. The syntax and constant Googling of how to use various data structures became too annoying for me. Plus the extra few steps of having to compile and then run … it’s not that much, but it adds up. Again I love just running Python code without having to compile first.
4. It’s addicting. The site was made for someone with my personality. Goals, levels, competition, etc.

I read a blog the other day of a guy talking about how Project Euler was boring. He had solved about 150 problems (he even contributed a problem!) and then got tired of the site. Something along those lines.

Well I plan to beat that guy. We’ll see how it goes. =)

Best Practice Exhaustion

I’m sure everyone has days like this – you feel a little worn out physically and exhausted mentally. I always know I’m extra fatigued mentally when I lose all interest in my work. Just like when I was racing mountain bikes, some days your mind refuses to go out and train. It’s too much. The idea of shirking your responsibility and doing anything except what you are supposed to do is irresistible.

While considering how I came to this point of fatigue, a new thought came to me in regards to development.

From time to time I believe I experience best practice exhaustion.

Work is filled with people that know best practices, but any idiot can read and remember rules. Most best practices start with the words “always” and “never”. Some examples:

1. Always check in your code to version control.
2. Always write unit tests.
3. Always do code reviews.
4. Never let an unanswered email fester.
5. Never let someone else drop the ball on a project.
6. Never use XYZ data structure, ABC pattern.

In software, it’s easy to compile a long list of best practices. Things that every good developer will always do.

Here are some implications of best practices:

1. You are responsible for any and all problems with a project.
2. If other people are lazy/non-responsive/not professional, that’s your fault too.
3. If a bug appears, and you could have prevented it, you’re guilty. Why didn’t you follow best practices?
4. If you’re exhausted at your desk and decide to stare aimlessly, people wonder: why aren’t you working?
5. Not only are you responsible for coding, but you are responsible for project management, people management, specifications, deployments, etc.
6. Your velocity is always measured by code only and not these other factors.

So I find myself sometimes stuck in a bit of a rut. I take a task at work, I think about it, start developing and this laundry list of do’s and don’ts floods my thought process.

“Have I checked in every piece of code I’ve ever written? Even one-off scripts? Even trivial database scripts?”
“Did I release something without a code review?”
“Did I log my time perfectly in XYZ tool?”
“Does everyone always know what I’m doing at all times? Did I send enough emails?”
“Have I thought about every possible boundary/use case no matter how obscure?”

I could go on and on and on.

The bottom line that summarizes this post is the statement: You never have any excuse for not doing something perfectly.

At least that’s how software engineering feels at times. I have found little grace and forgiveness amongst co-workers for simple human error.

We’ll see if there are solutions to the “no excuses” problem going forward.

Code Chef Problem – Distinct Prime Factors (KPrime)

Whew. This problem took me the better part of a week to solve. The first few days I misunderstood the problem and went off on a bit of a rabbit trail. Even though I have consistently learned to not rush headlong into solving a problem before properly understanding it, I keep repeating this mistake.

The Problem

The problem is in two parts:

1. Find the number of distinct primes for a number n. (For example, 10=2*5, so that is 2 distinct primes. 20=5*2*2, so again, 2 distinct primes)
2. For a range of numbers, for example, 1 to 100, how many numbers have k number of distinct primes?

An example:

Consider the numbers 4 to 10, with k=2. The question is … 4,5,6,7,8,9,10 … out of those numbers, how many have 2 distinct prime factors?

The answer?

– 4=2*2, so it has 1 distinct prime factor
– 5=5*1, so it has 1 distinct prime factor
– 6=2*3, so it has 2 distinct prime factors
– 7=7*1, so it has 1 distinct prime factor
– 8=2*2*2, so it has 1 distinct prime factor
– 9=3*3, so it has 1 distinct prime factor
– 10=2*5, so it has 2 distinct prime factors

The answer must be 2 then. 6 and 10 have two distinct prime factors.
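
The worked example can be checked with a small brute-force Python sketch. This is plain trial division written only to illustrate the definition. It is not the solution that was submitted, and the function names are made up for this example.

```python
def distinct_prime_factors(n):
    """Count the distinct primes dividing n by trial division."""
    count = 0
    p = 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:  # strip every copy of this prime
                n //= p
        p += 1
    if n > 1:  # whatever is left over is itself a prime factor
        count += 1
    return count


def count_k_primes(a, b, k):
    """How many numbers in [a, b] have exactly k distinct prime factors?"""
    return sum(1 for n in range(a, b + 1) if distinct_prime_factors(n) == k)
```

Calling count_k_primes(4, 10, 2) gives 2, matching the hand count above.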

The constraints were that the range could go up to 100,000 and k could range from 1 to 5.

First Swings

Initially I thought of actually generating prime number combinations. For example, if k=3, I would generate [2,3,5], [3,5,7], [5,7,11], etc. The idea being that order doesn’t matter and numbers can’t be repeated. The problem was that this would only yield numbers in which each prime factor appears exactly once. A number whose factorization is [2,2,3,3,5,5,5] also has three distinct prime factors, so in that sense it is the same as [2,3,5].

I abandoned this approach after trying it out for way too long.

My Good Friend OEIS

After Googling around a bit I ended up on this page: OEIS A001221.

For algorithm competitions, OEIS (On-Line Encyclopedia of Integer Sequences) is invaluable.

Unfortunately there were no code samples for generating k-distinct primes on the OEIS page. I tried to read the math specifics on Wikipedia and various other math sites and found the function I was looking for was omega(n). Even though I knew the function, I couldn’t find any good implementations of it.

New Approach

I did actually see something on OEIS that gave me an idea. There was a Maple implementation that inspired me to do this:

1. Loop over all the primes under 100,000.
2. For each prime, go through the multiples of it within a limit. (The limit ended up being ceil(100000 / prime(i)) )
3. For each number in (2), we add 1 to an index in a vector. For example, if we’re looking at 7 for the prime, we would have:
a. nums[7*1] = nums[7*1] + 1
b. nums[7*2] = nums[7*2] + 1
c. nums[7*3] = nums[7*3] + 1
d. …

In this way I could create a vector that cached the distinct prime count for numbers 1 to 100,000. Then when a range was given to me, I could iterate over the vector and count how many times a given number appeared.
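
The steps above can be sketched in Python roughly as follows. This is an illustration and not the submitted C++ code. As an assumption of this sketch, the separate prime generation step is folded into the same loop: a number whose count is still 0 when the loop reaches it has no smaller prime divisor, so it must be prime. The function names are invented for the example.

```python
LIMIT = 100_000  # upper bound of the range from the problem constraints


def build_distinct_prime_counts(limit=LIMIT):
    """Return nums where nums[n] is the number of distinct primes dividing n."""
    nums = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if nums[p] == 0:  # no smaller prime divides p, so p is prime
            # Add 1 at every multiple of p: p*1, p*2, p*3, ...
            for multiple in range(p, limit + 1, p):
                nums[multiple] += 1
    return nums


def count_in_range(nums, a, b, k):
    """Count the numbers in [a, b] whose cached distinct-prime count equals k."""
    return sum(1 for n in range(a, b + 1) if nums[n] == k)
```

The cache is built once, and each range query is then just a scan over a slice of it.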

Temptation to Cheat

At one point in my search for an implementation – or at least good written explanation – of omega(n), I came across this page: http://stackoverflow.com/questions/17545888/number-of-distinct-prime-factors-of-a-number.

A guy was doing the same problem and gave up and asked Stack Overflow to finish it for him. Not cool.

The temptation was there to just give in after a week of work. I could just read the answers on SO and submit the problem and be done with it. However, I hadn’t spent all that time just to quit and cheat.

First Submission

My first submission failed from the time constraint. A few weeks ago I made a subtle transition over to C++ for algorithm competitions because I had to use it for a problem on TalentBuddy. The language is still fairly new to me, so when my first submission failed I wasn’t surprised. I must have missed some major performance issues with my code.

The good news was that I didn’t get a “Wrong Answer” reply.

Micro Optimizations Work!

Every single time an algorithm has failed on CodeChef, micro optimizations have not worked. I’ll try minimizing variables, making loops shorter, etc. and it never works. There is always a major issue with the algorithm that only a major refactor can fix.

For the first time, I was able to do micro optimizations and it worked.

I did some performance profiling and was shocked that my prime generation function only took 2ms to run. Moreover, the caching of k-distinct primes only took 23ms! 25ms out of 1000ms allowed was spent caching the data I needed.

That meant looping through the vector of k-distinct primes was too slow! This really surprised me because I figured that code was as good as it could get.

The big change was accessing a vector by index instead of using .at().

nums.at(i)

… became …

nums[i]

Just like that my vector caching of k-distinct primes went from 23ms to 10ms. My submission finally worked!

Conclusion

C++ is very powerful and micro optimizations add up. Over time I figure I’ll learn more and more of them. I’m excited to see how other people solved this problem.

Code: https://github.com/ryan1234/codechef/tree/master/jul-2013/kprime
