Integration Testing with Entity Framework

At work we have a .NET codebase that is largely uncovered by unit tests. Currently we have about 20% coverage, and unfortunately a large chunk of the untested code is tightly integrated with Entity Framework.

After searching the web for a while I realized that it was going to be very tough to unit test anything related to Entity Framework. The best I could hope for was writing integration tests.

I’ve come across the problem of doing integration testing against databases before and each time I’ve failed miserably. Two examples stand out in my mind:

Youthful attempts at MySQL integration testing

I tried to do integration testing with MySQL for a SaaS company many years ago. The steps went something like this for testing:

a. Restore a production backup for MySQL to a new instance.
b. Run a test against the database.
c. Check the diffs to make sure we wrote data correctly, updated data, etc.
d. Kill the instance.
e. Repeat from (a) for the next test.

The problem was that it took too long to restore the database for each test. Also the schema and data changed often enough that it ended up breaking the tests that depended on a static database.

Overall the approach was just too complex to manage. I was in a bit over my head at that time in my career.

More mature integration testing with Mongo and MySQL

Fast forward a few years and I found myself needing to do database integration tests with Mongo and MySQL.

This time instead of trying to create a live instance of MySQL, I came across the idea of creating an embedded, in-memory database that my tests would point to. For MySQL I ended up using H2 since the data types were fairly similar. I tried other databases, but things like Derby had compatibility issues when trying to map data/schemas from MySQL.

For Mongo we used a project called “Embedded Mongo” and it worked pretty well.

Even though I successfully got both integration tests off the ground and working, I would consider the effort a failure. Why?

After several months of heavy development around the databases, I realized I had neglected the integration tests. I simply did not keep up with the maintenance of them and by the time I re-visited them, they were all broken.

The effort was a failure because I made the decision to abandon the integration tests instead of update them. The gap between reality and the tests had grown so large that I didn’t have the mental energy to fix the issues.

Entity Framework Integration Testing

Back to present day, and I find myself in a familiar place. I want to do integration tests against a database. I remember all the lessons I’ve learned from the past:

1. Make sure your tests can be updated easily in the future.
2. Try to use an embedded/in memory database.
3. Speed matters. You don’t want to create an entire database from scratch for each test.
4. Complexity kills. It’s easy to spiral out of control with complexity, making the testing work 10x harder than the actual implementation.

After spending a few days waist deep in thought and neck deep in code, I’ve given up for the time being on doing integration testing with Entity Framework.

In a way I’m proud of myself because in the past I would easily spend (or waste, I suppose) weeks trying to nail a solution. The older I get, the better I’ve become at cutting my losses. Better to abandon an approach entirely and come back to it later than be stubborn and let other projects suffer.

Some things I noticed from my attempts to test Entity Framework:

1. It’s confusing. The documentation, generated code, the articles online … nothing was intuitive to me.
2. Lots of hacks. The people who want to do integration tests around EF are getting really creative. I always get nervous when I don’t sense a community consensus around an approach.
3. SQL Server Compact Edition (CE) is a poor choice to mimic SQL Server. Lots of people suggested replacing a live SQL Server instance with the embedded version. While I was able to get this working, the schema discrepancies were too dramatic to make it feasible. Plus it was confusing how I could alternate between SQL Server and SQL Server CE in EF. (A rough sketch of the usual approach appears after this list.)
4. Complexity. It was just too complex. Yes, there are tools to generate a schema that you can use to construct a database in SQL Server CE. Yes, there are tools to generate data for SQL Server CE. Yes, you can automate all of this. No, it still isn’t easy.
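
For reference, the usual suggestion boils down to handing your DbContext an existing SQL Server CE connection in tests instead of the normal SQL Server connection string. A minimal sketch of the idea, assuming an EF code-first context – the BlogContext/Post names and the test.sdf file are made up for illustration:

    using System.Data.Common;
    using System.Data.Entity;
    using System.Data.SqlServerCe;

    public class BlogContext : DbContext
    {
        // Production: resolve the usual connection string by name.
        public BlogContext() : base("name=BlogContext") { }

        // Tests: hand in an existing connection, e.g. a SQL Server CE file.
        public BlogContext(DbConnection connection)
            : base(connection, contextOwnsConnection: true) { }

        public DbSet<Post> Posts { get; set; }
    }

    public class Post
    {
        public int Id { get; set; }
        public string Title { get; set; }
    }

    // In a test:
    // using (var ctx = new BlogContext(new SqlCeConnection("Data Source=test.sdf")))
    // {
    //     ctx.Posts.Add(new Post { Title = "hello" });
    //     ctx.SaveChanges();
    // }

Even with something this small, the schema and data type mismatches mentioned above show up quickly.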

Conclusion

The real issue at work might be that we’ve hidden a ton of LINQ that ties directly to Entity Framework in one database access class. The design from the start didn’t have unit testing in mind, and with project deadlines, etc., it might be too late to refactor.

I agree with a statement I read online that testing against LINQ to Objects is a poor substitute, because EF generates actual SQL from the LINQ you’ve written. It’s best to exercise the actual SQL generation to find potential bugs.
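
A tiny illustration of that gap, with hypothetical data (the point is the behavior, not the model): over LINQ to Objects, string == is ordinal and case-sensitive, while the same lambda translated by EF against SQL Server’s default case-insensitive collation would match. An in-memory test can pass or fail where the real query does the opposite.

    using System;
    using System.Linq;

    class User { public string Email { get; set; } }

    class Demo
    {
        static void Main()
        {
            var users = new[] { new User { Email = "bob@example.com" } };

            // In memory this finds nothing (ordinal, case-sensitive comparison).
            // Pushed through EF to SQL Server's default collation, the equivalent
            // query would match the row -- exactly the kind of difference a
            // LINQ to Objects test can't catch.
            var matches = users.Where(u => u.Email == "Bob@Example.com").ToList();
            Console.WriteLine(matches.Count); // 0 here; 1 against the database
        }
    }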

One of these days I hope to find a solid approach that will span technologies for doing integration testing with databases.


Broken Windows

In the book The Pragmatic Programmer the authors outline the theory of Broken Windows as it applies to code. Jeff Atwood has a nice article about the theory here.

Broken Window Theory: There was a study done that showed that when a building suffered a broken window, other windows were quickly broken and the building fell into disrepair. The theory is that if you leave something in a broken state – or neglect a flaw – the system as a whole will quickly degrade. The Pragmatic Programmer book argues the same thing happens with code.

Broken windows in code look like the following:

1. Comments that have become wrong/incorrect.
2. Conventions being broken – for example inconsistent names for objects.
3. Lazy implementations that could easily be improved upon.
4. Unorganized code and data structures.

The idea is that as you work with a codebase, if you see some of the things above, you are likely to believe no one cares for the code anymore. There are broken windows all over, so why try hard to write good code? Why try to refactor existing systems and classes? Why do extensive testing?

Lately I’ve found myself falling into this mindset when working with legacy code. At the end of this article I’ve listed a few broken windows I’ve found in the past few months. At a low level in my brain I find myself making excuses for taking shortcuts with my code. There are so many broken windows, so why not skip testing everything I touched?

My resolution is that even though I might be working with legacy code, I need to uphold the same standards I would have for new code.


Some examples:

1. A comment in code:

// UN-COMMON THIS LINE WHEN READY TO ROLL

2. A comment in a stored procedure in the database:

// COMMENTS: What is IndividualDates all about ? 

3. We need to be using source control:

// DISABLE CACHING FOR NOW 9/26/06
// Check cache to see if cache exists

4. Justin needs to test:

//Added true so the menu is always fresh
//You can remove....but justin wants to test the menu

5. Updating!

            if (e.CommandName == "Up")
            {
                if (indx > 0)
                {
                    newCat = lCats[indx - 1];
                    upDateDB = true;
                }
            }
            else
            {
                if (indx < lCats.Count - 1)
                {
                    newCat = lCats[indx + 1];
                    upDateDB = true;
                }
            }

6. Good luck reading that for loop.

            for (int i=0;i<lCats.Count;i++)
                if (lCats[i].MenuCategoryID == mCatID)
                {
                    indx = i;
                    oldCat = lCats[i];
                }

7. Long, long, long. Found a 700 line function to initialize a web forms .aspx page. 850 lines total in the class, so about 82% of the code in this class is dedicated to one method!


Stubbornness with Tools

For my entire computing life I’ve been slow to adopt new tools. I always end up learning a program or tool and then sticking with it no matter what.

Some quick examples:

Purpose             Old/Comfortable       Huge Advance
Writing Java        vim                   Eclipse (even though I generally hate Eclipse)
Operating System    DOS                   Windows 95
HTML/JavaScript     Notepad               Notepad++
JavaScript          Pure JavaScript       jQuery
Prototyping         Temp .NET Projects    LINQPad

The latest entry is using LINQPad to do quick C# prototyping. In the past if I wanted to test an algorithm or some simple function, I would create a new project, call it “Sandbox” and start coding. It would take several minutes to make the console project, write a bunch of static methods and then test. Then I’d have to open a CLI prompt and run it. Bleh.

With LINQPad I can start just banging out C# and run it without all the extra “stuff”. Awesome.
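
For example, a throwaway check that used to mean a new Sandbox project is now just a “C# Statements” query in LINQPad. Dump() is LINQPad’s built-in pretty-printer:

    var words = new[] { "pear", "apple", "fig" };
    words.OrderBy(w => w.Length).Dump();              // strings, shortest first
    Enumerable.Range(1, 5).Select(n => n * n).Dump(); // 1, 4, 9, 16, 25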

I heard about LINQPad years ago, but refused to look into it. Like all the other entries in my list above, it really comes down to pride.

Deep down in my brain there is this thought: “You’ve wasted a ton of time by not using XYZ tool!” It’s hard to admit that for years I wrote raw JavaScript to accomplish tasks that jQuery could have done far better. It’s also hard to admit that I would notice other developers using Notepad++ and think, “Ha! So weak. Notepad++. I don’t need that.” Now I can’t imagine being without it!

The lesson for me is to be quicker to adjust to new tools that can make me more productive.


String Searching Analysis – Rabin Karp versus Naive Search

I recently finished an algorithm problem that involved implementing the Rabin Karp search algorithm.

After reading several articles about it, I thought it would be fun to run some tests to prove how much faster it is than a naive search algorithm.

Unfortunately most of my tests show that it is actually far worse than two naive/brute force approaches that I used. This surprised me because on TalentBuddy my initial naive approaches were too slow to pass their tests. Only after implementing Rabin Karp could I get the tests to pass.

I’ve been anxious to start using R to visualize data, and this was a great opportunity to do so.

Benchmark details:

1. For each graph, the tests were run 10 times back to back and the graphs show the mean of each algorithm.
2. The random data was generated from an English dictionary.
3. 100,000 bytes were taken from the end of the randomly generated file and this was used as the search pattern. Then I concatenated the random file together four times to make the pattern appear four times.
4. It was a bit hard to find long common phrases in the Bible and Shakespeare.
5. The slow behavior of Rabin Karp seemed to come from the “sliding” part of the algorithm: the loop that moves forward one character at a time and does the comparison was the slowest part. (A rough sketch of the idea is below.)
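
For context, here is a rough sketch of the Rabin Karp idea in C# – not the exact code I benchmarked (that’s in the repo linked below), but the same structure: keep a rolling hash of the current window and only fall back to a character-by-character comparison on a hash match.

    // Rough Rabin Karp sketch: returns the index of the first match or -1.
    public static class RabinKarpSketch
    {
        const int Base = 256;     // treat characters as base-256 digits
        const int Mod = 1000003;  // prime modulus to keep hashes small

        public static int Search(string text, string pattern)
        {
            int n = text.Length, m = pattern.Length;
            if (m == 0 || m > n) return -1;

            long high = 1;  // Base^(m-1) % Mod, used to drop the leading character
            for (int i = 0; i < m - 1; i++) high = (high * Base) % Mod;

            long pHash = 0, tHash = 0;
            for (int i = 0; i < m; i++)
            {
                pHash = (pHash * Base + pattern[i]) % Mod;
                tHash = (tHash * Base + text[i]) % Mod;
            }

            for (int i = 0; i + m <= n; i++)
            {
                // On a hash match, verify character by character. This verify step
                // plus the hash roll below is the "sliding" part that dominated my timings.
                if (pHash == tHash && string.CompareOrdinal(text, i, pattern, 0, m) == 0)
                    return i;

                if (i + m < n)
                {
                    tHash = (tHash - text[i] * high % Mod + Mod) % Mod; // drop text[i]
                    tHash = (tHash * Base + text[i + m]) % Mod;         // add text[i + m]
                }
            }
            return -1;
        }
    }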

Naive 1 = My best attempt at a naive search.
Naive 2 = http://www.geeksforgeeks.org/searching-for-patterns-set-1-naive-pattern-searching/
Rabin Karp = http://www.geeksforgeeks.org/searching-for-patterns-set-3-rabin-karp-algorithm/

[Charts: mean run times for Naive 1, Naive 2, and Rabin Karp across the four test data sets]

I used this article to generate my random word file.

All my code, scripts, data, etc. can be found here: https://github.com/ryan1234/talentbuddy/tree/master/mapreduce/hashstring/analysis/hash-string


Debugging War Story – 8 hours with iTunes

The other day I got an email saying that a new production bug had come up and it was my turn to take it.

The Problem

We have a bunch of podcasts at work and each is powered by an RSS feed. An example URL looks something like this:

…/mc/PodcastStream.aspx?p=yDi0V4EwP58=&iid=Hc0ccBgnSdU=&site=yDi0V4EwP58=

For some reason iTunes dropped our podcast and when we tried to re-submit it, we got this error:

itunes podcast error parsing feed: invalid xml: error on line 1: content is not allowed in prolog.

Thus the podcast is dead in the water and even though the URL looks ok, iTunes rejects our RSS feed.

Initial Investigation

The technology stack here is C# (.NET 4.0) and IIS. The code is a straightforward .aspx page that generates XML and writes it to the HTTP response stream.

Things I tried quickly:

1. I checked the byte order mark (BOM) for the body of the HTTP response. I removed the BOM that I found and it still failed.
2. I tried encoding the HTTP response as ASCII instead of UTF-8. Nope.
3. Messed around with the ContentType HTTP header. Tried “text/xml” (which apparently should default to ASCII), “text/xml; charset=utf-8” and “text/html”. Interesting work, but none of these worked. (A rough sketch of these tweaks appears after this list.)
4. Copied the IIS logs to my local machine so I could see iTunes requesting the RSS feed page. I found it easily and it seemed we were serving the content fine. Lots of 200 responses. (I noted that iTunes was doing a HEAD and then a GET. Something to look at later?)
5. Created a static, good working version of the RSS feed. iTunes accepted this, but it wasn’t dynamic, so newer podcasts wouldn’t be seen.
6. Ran my good and bad RSS feeds through the RSS feed validator website. (http://feedvalidator.org/)
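
For the curious, the tweaks in items 1–3 were variations on this in the .aspx code-behind. BuildFeedXml is a hypothetical stand-in for the real feed generation:

    using System;
    using System.Text;
    using System.Web.UI;

    public partial class PodcastStream : Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            Response.Clear();
            Response.ContentType = "text/xml";
            // new UTF8Encoding(false) = UTF-8 without a byte order mark.
            Response.ContentEncoding = new UTF8Encoding(false);

            string rssXml = BuildFeedXml(); // hypothetical helper that builds the RSS document
            Response.Write(rssXml);
            Response.End();
        }

        private string BuildFeedXml()
        {
            // Elided: query the podcast episodes and build the <rss> document.
            return "<?xml version=\"1.0\" encoding=\"utf-8\"?><rss version=\"2.0\"></rss>";
        }
    }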

Having failed to find any low-hanging fruit, I knew I had to dig deeper.

A deeper dive

I try not to get too complex too fast when troubleshooting, but sometimes I give in to the “hero” work. I want to find a really nasty bug so I can say that I solved something difficult. I once read an article about a guy that proved a cosmic ray flipped a bit on his box! (https://blogs.oracle.com/ksplice/entry/attack_of_the_cosmic_rays1).

A more serious approach:

1. I compared the HTTP headers of the valid static request/response to the failed ASPX request/response. There were some minor differences, but nothing that stood out.
2. I moved the successful base case onto the production web server where the failing RSS feed was. This would help eliminate IIS as the issue. Sure enough, the static RSS page was fine and the dynamic one was bad.
3. I needed to iterate quickly with changes so I moved all the C# code from a .cs file to the .aspx file. This was fantastic because I could make changes without having to recompile and deploy a new DLL. Since I was working in production I made a /temp/ directory where I could work without risking some sloppy error.
4. Now I had two files on production and one was failing and one was ok. The bad one was an .aspx file and the good one was an .html file. Next up was to change the .html file to an .aspx, so I did that and had the C# simply read a text file from disk and serve that.
5. Loaded up a hex editor to compare the output from the bad feed and the good feed. They were the same!
6. Went to my Ubuntu VM and did a bunch of curl tests. Who knows what environment iTunes was running to make the calls. curl reported nothing interesting. It seemed to be the same.

Desperation

At this point I’ve taken out a hex editor. I’ve moved to Linux to get more serious with curl. I’ve begun working on production to do quick tests. Things have gone from serious to desperate.

To summarize, at this point, this is what the good feed and the bad feed had in common:

1. Same HTTP headers for requests and responses. Byte for byte, the same!
2. Same content in the HTTP response body. The same even down at the byte by byte level!
3. Same URL and querystring.
4. On the same server in the same directory.
5. Both running through IIS.

Now my brain was spinning. Same headers? Same responses? Ugh.

Out came Wireshark on my Ubuntu VM. What I found:

1. All TCP traffic was the same between the good and bad feed.
2. The only difference was that the bad feeds consistently seemed to hit one extra domain controller on our network. The good feeds never hit this machine. It turns out this was a major dead end and I’m stoked I didn’t fall for it.
3. All TCP packets were the same, meaning each byte was the same between the two requests. (At least the same for things that mattered.)

Resolution!

The big break came when I started to take the code apart for the broken feed line by line. I commented out large sections and quickly narrowed it down to one section of code.

    if ((!(Request.QueryString["s"] == null)
            && ((Request.QueryString["s"].Length > 0)
            && !(enc.Decrypt(Request.QueryString["s"]) == "Error"))))
    {
        ...
    }
    else if ((!(Request.QueryString["p"] == null)
            && ((Request.QueryString["p"].Length > 0)
            && !(enc.Decrypt(Request.QueryString["p"]) == "Error"))))
    {
        ...
    }
    else
    {
        Response.Write("No S or P in querystring");
        Response.End();
    }

The code looks at the query string and checks whether certain values are there. If they are, it tries to decrypt the values and then parse them into integers.

I quickly realized that we hit the Response.Write(…) branch only when iTunes requested the URL.

If I hit the URL in Chrome, Firefox, or via curl, everything returned fine. But when iTunes took the URL, it failed.

The reason ended up being that iTunes lowercases the URL before checking it. I pasted in the following:

…/mc/PodcastStream.aspx?p=yDi0V4EwP58=&iid=Hc0ccBgnSdU=&site=yDi0V4EwP58=

which iTunes turned into:

…/mc/podcaststream.aspx?p=ydi0v4ewp58=&iid=hc0ccbgnsdu=&site=ydi0v4ewp58=

Of course, trying to decrypt a case-sensitive string that has been lowercased fails.

End, end, end

The fix in the end was to remove the encrypted values from the query string (they weren’t sensitive anyway) and resubmit to iTunes.

All is well now, and eight hours have been almost entirely wasted.


CodeChef problem: PREDICT

The Problem

Each month CodeChef runs a coding competition in which I try to solve at least two problems. Usually the easiest two are simple and can be written and verified in a few hours each.

The problem that had me a bit stuck this month was one involving probability. I haven’t done much work with probability since college (over 10 years ago), so I had to brush up on some ideas.

The outline of the problem (with an example):

1. Two teams are playing each other, team A and team B.
2. The probability that team A wins is pX. (For example, 0.51, meaning a 51% chance that team A wins.)
3. The probability that team B wins is 1 – pX.
4. We will always bet 10,000 dollars between the two teams.
5. We can bet on both teams at once.
6. If a team wins, then the money made is calculated by (2 * (1 – pX)) * x, where “x” is the amount we bet. For example, we bet 6,000 on team A, who has a 51% chance of winning … if they win, we make (2 * (1 – 0.51)) * 6000, which is 5,880 dollars.
7. If a team loses, then we lose the amount we bet. If we bet 4,000 dollars on team B and they lose, we’ve only lost 4,000 dollars.

The problem is to find the optimal amount to bet on each team so that we maximize our money.

This means that if I bet 1,000 on team A and 9,000 on team B, I make X amount of money. If I change the amount for team A from 1,000 to 2,000 (and subsequently the team B amount from 9,000 to 8,000), then I’ll make Y amount of money. We just keep changing the amounts until we find one that we know is the highest.

Thought Process

Whenever I read the word “optimal” I assume the problem is going to be difficult. Specifically with this problem I thought it was going to mean I would come up with a function, f(x), where I would have to find a maximum – or a place where the slope was 0. It made me think the problem was harder than it was.

I tried to do some math based on the problem details (http://www.codechef.com/JUNE13/problems/PREDICT) and could not seem to generate the example shown.

This led me to Google around for betting formulas, which led me to the Kelly Criterion – http://en.wikipedia.org/wiki/Kelly_criterion. Aha! Complex math and formulas! I’ve done CodeChef problems before where a formula was at the heart of a problem, so I assumed this must be my answer.

Long story short, the Kelly Criterion was a dead end. I could never make it work given my variables, so I assumed it didn’t match my problem space.

Progress…

The real breakthrough came when I sat down and slowly thought through the problem. I had gone through the problem before, but this time I tried not to skip steps.

The example on the website had two inputs and one output. Eventually I worked through their example by hand so that I could reproduce their result. The next step was to generalize the formula. This is the result:

# p = probability team A wins, x = amount bet on team A, total = total amount bet
p = 0.51
x = 6000
total = 10000
# Expected money: (money after an A win) * p + (money after a B win) * (1 - p)
result = ((total + ((2 * (1 - p)) * x) - (total - x)) * p) + ((total + (2 * (1 - (1 - p))) * (total - x) - (total - (total - x))) * (1 - p))

When I printed the result I got 10018, which matched perfectly.

I decided to use Python to solve this problem because it’s quick and easy to get going, and several people in the comments on the site mentioned that Python handles floating point precision well, which was important.

Surprising Results

Now that I had a formula that worked in one case, it was time to graph some general results. I just ran the formula through a loop and graphed the output. This is what I got:

[Chart: expected result versus the amount bet on team A – a straight line]

It’s linear. I couldn’t believe it, so I picked some other probability for team A and ran my test again. I thought I’d see a curve somehow, but got this:

[Chart: the same sweep with a different probability for team A – still a straight line]

Not a curve. I suppose it shouldn’t have surprised me, since the formula above leaves no room for a curve.
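
Expanding the formula by hand shows why (same variables as before, with x the amount bet on team A):

    result = ((total + 2*(1 - p)*x - (total - x)) * p) + ((total + 2*p*(total - x) - x) * (1 - p))
           = p*x*(3 - 2*p) + (1 - p)*(1 + 2*p)*(total - x)
           = (2*p - 1)*x + total*(1 + p - 2*p*p)

With p = 0.51 and total = 10000 that is result = 0.02*x + 9898, which reproduces the 10018 above for x = 6000. The whole thing is linear in x.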

Since it was linear, the best bet sits at an endpoint: depending on the probability of winning, I should bet either 0 dollars or all 10,000 on team A. Then it’s simply a matter of plugging those two amounts into the formula and taking the larger result.

Conclusion

This took me a lot longer than it should have. Each time I struggle with a problem like this I learn to take my time and patiently work through it. I also learn not to make assumptions or follow rabbit trails.

Solution: https://github.com/ryan1234/codechef/tree/master/jun-2013/predict


Dead Ends and Wasted Time

It’s easy to write about success, but not so easy to write about failure. Not just failure either, but unresolved failure.

Some quick specifics:

1. At work I tried to introduce a new library to a .NET project. Specifically the Thinktecture.IdentityModel.dll library.
2. In its assembly manifest it wants version 4.5.0.0 of Newtonsoft.Json.dll.
3. My project references lots of other projects with really, really, really old versions of Newtonsoft.Json.dll.
4. When old meets new, I get errors that .NET can’t find the 4.5.0.0 version of Newtonsoft. The older version has won and is used across the board.

Unfortunately I lost an entire day trying to resolve this issue. I’m pretty sure some engineer out there would have a clean and clear solution for it, but I couldn’t find that guy online. =)

Things I tried:

1. Merging all DLLs Thinktecture needs into one mega DLL with ilmerge.exe. Didn’t work, but man ilmerge.exe was fun to work with. Good to know it exists.
2. Upgrading the projects that reference the old Newtonsoft.Json.dll to the new version. Didn’t work because there are breaking changes between old and new (~2.0 and 4.5).
3. Explicitly referencing the new version in the web.config for my project. I found all sorts of ways to do this online and none of them worked. (The most common suggestion, a binding redirect, is sketched after this list.)
4. Turning on assembly loading tracing. This was also really fun and interesting. Super cool to see how .NET is loading assemblies. Ultimately no help though.
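
For reference, the most commonly suggested variant of item 3 is an assembly binding redirect in web.config, along these lines (the public key token is Newtonsoft.Json’s well-known one; treat the exact version numbers as illustrative):

    <configuration>
      <runtime>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
          <dependentAssembly>
            <assemblyIdentity name="Newtonsoft.Json"
                              publicKeyToken="30ad4fe6b2a6aeed"
                              culture="neutral" />
            <bindingRedirect oldVersion="0.0.0.0-4.5.0.0" newVersion="4.5.0.0" />
          </dependentAssembly>
        </assemblyBinding>
      </runtime>
    </configuration>

In my case the older version still won, so this alone didn’t solve it.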

When I’m stuck in a maze I tend to only go back 2-3 steps and then try a new approach. After burning through an entire day though, it’s time to take 10 steps back and try a different path.
