IdentityServer 2.0.1 on Mono

At my current gig we use IdentityServer to help us manage security for our API. Our aim is to run quite a few virtual machines in the cloud, so pricing is key. Linux machines are much cheaper than Windows boxes, so getting IdentityServer running happily on Mono is crucial.

Requirements

This blog post assumes you are running a Debian-based Linux distro. This example specifically uses Debian Jessie, but the concepts should translate easily to other distributions, including Ubuntu.

For quick iterations, I would also recommend running a local VM via Vagrant or Docker. It makes testing from scratch much easier. (Make sure to open up port 44319 if you go down this route!)
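If you go the Docker route, for example, a throwaway Debian Jessie container with port 44319 published is enough to follow along (the container name here is just an example):

# Disposable Debian Jessie container with the IdentityServer port published
docker run -it --name idsrv-test -p 44319:44319 debian:jessie /bin/bash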

Installing Mono

Installing Mono can be surprisingly tricky. The package references change from time to time. The following works at the time of writing:

# Installing Mono 4.0.4
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF
echo "deb http://download.mono-project.com/repo/debian wheezy main" | sudo tee /etc/apt/sources.list.d/mono-xamarin.list
echo "deb http://download.mono-project.com/repo/debian wheezy-libjpeg62-compat main" | sudo tee -a /etc/apt/sources.list.d/mono-xamarin.list
sudo apt-get update
sudo apt-get install -y mono-devel ca-certificates-mono fsharp nuget curl unzip git


vagrant@vagrant-debian-jessie:~$ mono --version
Mono JIT compiler version 4.0.4 (Stable 4.0.4.1/5ab4c0d Tue Aug 25 23:11:51 UTC 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           __thread
	SIGSEGV:       altstack
	Notifications: epoll
	Architecture:  amd64
	Disabled:      none
	Misc:          softdebug
	LLVM:          supported, not enabled.
	GC:            sgen

Installing DNX (beta7)

Use the following (taken from the Docker repo) to install DNX beta7:

(Run each command as root)

export DNX_VERSION=1.0.0-beta7
export DNX_USER_HOME=/opt/dnx
mkdir -p /opt/dnx
curl -sSL https://raw.githubusercontent.com/aspnet/Home/dev/dnvminstall.sh | DNX_USER_HOME=$DNX_USER_HOME DNX_BRANCH=v$DNX_VERSION sh
bash -c "source $DNX_USER_HOME/dnvm/dnvm.sh && dnvm install $DNX_VERSION -a default && dnvm alias default | xargs -i ln -s $DNX_USER_HOME/runtimes/{} $DNX_USER_HOME/runtimes/default"
export PATH=$PATH:$DNX_USER_HOME/runtimes/default/bin


root@vagrant-debian-jessie:~# dnx --version
Microsoft .NET Execution environment
 Version:      1.0.0-beta7-15532
 Type:         Mono
 Architecture: x64
 OS Name:      Linux

Installing libuv

libuv is the asynchronous I/O library (best known from Node.js) that Kestrel is built on. Kestrel needs it to serve IdentityServer over HTTP.

(Run each command as root)

apt-get -qqy install autoconf automake build-essential libtool
export LIBUV_VERSION=1.4.2
curl -sSL https://github.com/libuv/libuv/archive/v${LIBUV_VERSION}.tar.gz | tar zxfv - -C /usr/local/src && cd /usr/local/src/libuv-$LIBUV_VERSION && sh autogen.sh && ./configure && make && make install && cd ~ && rm -rf /usr/local/src/libuv-$LIBUV_VERSION && ldconfig

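Once make install finishes, a quick sanity check is to make sure the shared library is visible to the loader (the version shown will depend on LIBUV_VERSION):

# Confirm libuv is registered in the linker cache
ldconfig -p | grep libuv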

Installing IDP

We are going to take a simple example from the IdentityServer Samples repo and start it up. There are a few minor changes that need to be made, so make sure to read through the rest of this section before trying to start the identity server.

(Run as root)

mkdir -p /var/lib/dnx/idp3
cd /var/lib/dnx/idp3
git clone https://github.com/IdentityServer/IdentityServer3.Samples.git
cd ./IdentityServer3.Samples/source/AspNet5Host/src/IdentityServerAspNet5
dnu restore


Now we need to edit project.json to add the Kestrel package and a command to run Kestrel.

Add a dependency for Kestrel so that your dependencies look like this:

  "dependencies": {
    "Microsoft.AspNet.Server.Kestrel": "1.0.0-beta7",
    "Microsoft.AspNet.Server.IIS": "1.0.0-beta7",
    "Microsoft.AspNet.Server.WebListener": "1.0.0-beta7",
    "Microsoft.Owin": "3.0.1",
    "Microsoft.AspNet.DataProtection": "1.0.0-beta7",
    "Microsoft.AspNet.Owin": "1.0.0-beta7",
    "IdentityServer3": "2.0.1"
  },

Also add a command to start Kestrel so that your commands look like this:

  "commands": {
    "web": "Microsoft.AspNet.Hosting --config hosting.ini",
    "kestrel": "Microsoft.AspNet.Hosting --server Microsoft.AspNet.Server.Kestrel --server.urls http://localhost:44319"
  },

Finally we need to edit Startup.cs.

using Microsoft.AspNet.Builder;
using Microsoft.Framework.DependencyInjection;
using System.Security.Cryptography.X509Certificates;
using IdentityServer3.Core.Configuration;
using Microsoft.Dnx.Runtime;

namespace IdentityServerAspNet5
{
    public class Startup
    {
        public void ConfigureServices(IServiceCollection services)
        {
            services.AddDataProtection();
        }

        public void Configure(IApplicationBuilder app, IApplicationEnvironment env)
        {
            var certFile = env.ApplicationBasePath + "/idsrv3test.pfx";

            var idsrvOptions = new IdentityServerOptions
            {
                Factory = new IdentityServerServiceFactory()
                                .UseInMemoryUsers(Users.Get())
                                .UseInMemoryClients(Clients.Get())
                                .UseInMemoryScopes(Scopes.Get()),
                RequireSsl = false,
                SigningCertificate = new X509Certificate2(certFile, "idsrv3test")
            };

            app.UseIdentityServer(idsrvOptions);
        }
    }
}

I changed the path for the pfx file so that Linux was happy with it, and I added the RequireSsl flag to make testing easier up front. You’ll of course want to take this out later!
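If the certificate refuses to load, it can help to sanity-check the pfx file and its password from the shell first. This assumes openssl is installed; the password is the one the sample ships with:

# Dump the certificate details without extracting the private key
openssl pkcs12 -info -in idsrv3test.pfx -passin pass:idsrv3test -nokeys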

Running the IDP

Finally we are ready to start the identity server.

Simply run dnx kestrel and you should be set!

root:/var/lib/dnx/.../IdentityServerAspNet5# dnx kestrel

You can now browse to http://localhost:44319 to see the welcome page of the application.
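If you are testing from a headless VM, a quick way to confirm the server is really up is to request the OpenID Connect discovery document with curl (IdentityServer is mounted at the root in this sample):

# Should return a JSON document listing the token, authorize and userinfo endpoints
curl http://localhost:44319/.well-known/openid-configuration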


HDInsight – Log storage attempt #1

I plan on doing a series of posts describing my attempts to get data into HDInsight which is hosted in Azure.

At my current job there is a need to do business intelligence reporting. We are just starting to investigate and plan out a data platform where we can store an arbitrary amount of data and then run reports on it later.

Some of the informal requirements we have for the new data platform are:

1. Store an arbitrarily large amount of data. We don’t want to worry about deleting data, and we want everything to be historical.

2. Cheap. The storage needs to be cheap. We want to minimize the cost of scaling out nodes as our data grows.

3. Dynamic. This is an important requirement. We don’t want to have to manage a rigid schema. We will constrain the “records” to appending new data only. Ideally we only add fields; we don’t change or remove old ones.

Candidate Data

The candidate data set for getting a proof of concept off the ground is 404 logs. If someone comes to our site and the page isn’t there, we write that data to the data platform.

Hadoop felt like a natural fit because it has been around for a long time and has a great ecosystem around it. We would store the 404 logs in HDFS and then run map/reduce jobs to report.

In my first proof of concept I got a small Linux VM up and running with Hadoop and sent 404 logs to HDFS via WebHDFS.
It worked great, and the 404.log file in HDFS would grow and grow as pages couldn’t be found on our website. The key point here is that I was only appending lines to the end of a file.
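For reference, appending over WebHDFS is a two-step dance: the namenode answers the first request with a redirect to a datanode, and the second request carries the data. A rough sketch (the hostnames, ports and paths are placeholders for my setup):

# Step 1: ask the namenode for a datanode to append to; it replies with a 307 redirect
curl -i -X POST "http://namenode:50070/webhdfs/v1/logs/404.log?op=APPEND&user.name=hdfs"
# Step 2: send the new lines to the URL from the Location header of the redirect
curl -i -X POST -T new-404-lines.log "http://<location-header-from-step-1>"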

The next big step was to make it production ready.

Going to Production

In Azure it is clearly cheaper to spin up an HDInsight cluster than to run my own cluster of Linux VMs running Hadoop. That is great news, but how do I get data into HDInsight?

The Azure answer is to put the data into blob storage.

This is where the roadblock went up. The blob storage APIs do not allow appending data to existing files. You have to do things like queue up writes and then build and commit blocks yourself (complicated), or download an entire file, append a line and re-upload it.

I scoured the web for how to append small amounts of data to blobs and couldn’t find any good answers. I looked through log4net code (that writes to blob storage), I tweeted everyone I could find related to blob storage and HDInsight and … no good replies.

This might be a good time to challenge my assertion that appending data is good. Why do that?

The word on the street is that Hadoop prefers a small number of large files over a large number of small files for running map/reduce jobs. Because of this I want to be able to cheaply add data to large files.

A backdoor?

Eventually I found two .jars (https://github.com/prateek/wasb-parcel/tree/master/parcel-src/lib/hadoop/lib) that would integrate Azure blob storage with HDFS on the command line. I could communicate over wasb:// and copy files into blob storage, read files from blob storage, etc.
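With the .jars on the classpath and the storage key configured, blob storage looks like just another file system from the hadoop CLI. Roughly (the account and container names here are made up):

# List what is already in the container
hadoop fs -ls wasb://logs@mystorageaccount.blob.core.windows.net/
# Copy a local 404 log up into blob storage
hadoop fs -copyFromLocal 404.log wasb://logs@mystorageaccount.blob.core.windows.net/404/404.log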

After getting a proof of concept running with the .jars I prematurely tweeted my excitement because I thought I had finally reached my goal. Why premature? Because when I tried to execute the -appendToFile command I got a sad “Operation not supported.” message.
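In other words, something along these lines is what falls over (same made-up account and container as above):

# Works against plain HDFS, but against wasb:// it fails with "Operation not supported."
hadoop fs -appendToFile new-404-lines.log wasb://logs@mystorageaccount.blob.core.windows.net/404/404.log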

In the end I decided to run one small Linux VM in Azure that would run Hadoop in a Docker container (https://registry.hub.docker.com/u/sequenceiq/hadoop-docker/). This way I can accept true append-only writes and persist them to HDFS. Storage is cheap, so I can keep adding disks if I want, or just rotate out old data and archive it in blob storage. Next I could copy the files into blob storage with the Azure HDFS wrapper, spin up my cluster and report to my heart’s content.
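Bringing that container up is a one-liner; the bootstrap script is part of the sequenceiq image (the tag here is just an example):

# Start a single-node Hadoop cluster in a container and drop into a shell
docker run -it sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash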

This would have worked, except that after going to production I started to notice that after a few hours Hadoop would get confused and start complaining about corrupted blocks. Was this because of Docker? Azure? Bad config?

2014-09-16 14:27:44,497 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
	at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1986)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1194)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:531)
2014-09-16 14:27:44,498 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 22acceea218d:50010:DataXceiver error processing WRITE_BLOCK operation  src: /172.17.0.5:58867 dst: /172.17.0.5:50010
java.io.IOException: Corrupted block: ReplicaBeingWritten, blk_1073741859_7815, RBW
  getNumBytes()     = 910951
  getBytesOnDisk()  = 910951
  getVisibleLength()= 910951
  getVolume()       = /tmp/hadoop-root/dfs/data/current
  getBlockFile()    = /tmp/hadoop-root/dfs/data/current/BP-953099033-10.0.0.17-1409838183920/current/rbw/blk_1073741859
  bytesAcked=910951
  bytesOnDisk=910951
	at org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.createStreams(ReplicaInPipeline.java:218)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:214)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
	at java.lang.Thread.run(Thread.java:744)

I have no idea and I’m done trying to figure it out.

The next blog post will be about the shift to trying the ELK (ElasticSearch, Logstash, Kibana) stack. I hear you can pump data from Elastic to Hadoop without too much trouble.


War and Peace

If you took any time slice from my career, I could easily tell you if that time period was war time or peace time. Based on the projects, the management, my co-workers, I was either stressing out (war time) or bored out of my mind (peace time).

What I want to do with this post is contrast war and peace in a software creation environment.

War Time

It’s easy to know when you’re at war. There is a constant feeling of subtle panic every time you sit down. You leave the office drained, tired and quite often angry.

Here are some specific symptoms I’ve noticed from my career:

1. You have several outstanding production issues that are non-trivial.
2. You have abandoned quality in favor of quantity. (No unit tests, code review, etc.)
3. Stakeholders are running amok. You are strongly encouraged to say “yes” to everyone.
4. You are missing management to help shield you. You have to make managerial decisions, sit in meetings and develop politics.
5. You are working extra hours to keep up.

There are of course more symptoms. Let’s look at peace time to help emphasize the divide between the two:

Peace Time

1. You have time to read articles about your industry – without feeling guilty.
2. You have time to write unit tests.
3. You have time to re-factor code per (2).
4. You have time to think about names for objects, classes, etc.
5. You have periods of focus that go longer than one hour. It’s normal to have four uninterrupted hours of time.

Conclusion

Now the point of this post is not to argue that developers should try to get to peace time. The point is not to decry the horrors of war. That is obvious.

The point is that as an organization, you need to know what state you’re in.

If the development team has bloodshot eyes and the stakeholders are playing Nerf basketball, then there is a major disconnect. One team thinks it’s peace time and the other is at war.

This is bad because expectations end up missing each other by wide margins. A peace time customer will ask for 100 features for a system. If he knew the development team was at war, he wouldn’t ask for 100 features. When you’re at war, compromises and sacrifices have to be made.

At any job, I’m finding it important to know which state I’m in and then communicate that to others. It’s only fair that the stakeholders know that I consider us to be at war. Hopefully it helps them adjust their expectations.


SQL Humility

At almost every job interview I’ve ever done I’ve been asked some variation of this question:

“On a scale of 1-10, how good are you with SQL?”

I answer the same each time. I try to give the interviewer a number that reflects my knowledge of the language and my ability to answer trivia, and another number which represents my confidence with the language. The second number is what I try to emphasize.

“I’m very comfortable and confident in SQL – even if I don’t know every command/approach/technique/piece of trivia.”

I don’t know why, but writing SQL had at some point in my career become a point of pride. I thought I was a stud. At my most recent job I’ve realized I have a lot to learn.

Syntax I’ve recently come across that I had previously never seen:

1. order by 1 desc (Huh? I can specify the column number? Seriously?)
2. full join (Full join? What is that? Why can’t I just use “join”?)
3. escape ‘|’ (I can escape characters in a where statement? Hmmmmmmm.)
4. select convert(varchar, getdate(), 107) (Wait, wait, wait. I can format a date as a string with different formats?!)

The list goes on, but the point is made. I have a lot to learn.

As a result I’ve been reading lots of articles from http://use-the-index-luke.com/, which makes me feel smart and strengthens my SQL skills.


Azure+TeamCity, Build Server Fun!

At work we have a strong need for a build server with pseudo continuous integration and the task was passed to me to get something up and running. Many years ago at a different job I cobbled together Cruise Control and some NANT scripts and was able to get something like a build server going. This time around I knew I could do better and on the recommendation of a co-worker I went with TeamCity.

TeamCity is awesome! Poor CruiseControl.Net looks awfully outdated compared to TeamCity.

Some quick things I love about TeamCity:

1. Lots of integrations out of the box. It can talk to TFS, MSTest, MSBuild, Powershell, etc.
2. It just works out of the box. The web portal fired up fine, the Windows service installed cleanly, etc.
3. There is a ton of support online for it. CruiseControl has some of this, but not as much as TeamCity.

It was a bit harder to get TeamCity to publish projects to Azure, but with the help of a blog or two online, I got it working.

Our process now:

1. Check out latest solution from TFS.
2. Update Nuget packages.
3. Build the entire solution.
4. Run unit tests.
5. Publish to Azure (the staging instance just to be safe).
6. Flush Redis.
7. Run integration tests.
8. Email the team that a build is successful.

It’s been a few days of work getting it all dialed in, but it feels good to add some automation to our process. It was getting tiresome to deploy manually and run tests a few times a day.


Battling Burnout

In many of my software jobs I’ve had to battle burnout. Things usually follow this progression:

1. Honeymoon period at job. Tons of time to do quality work up to my own standards.
2. I deliver on a few projects/features. I build a reputation.
3. The dam bursts and I drown in new projects/features.
4. I do my best to keep up and come up with solutions. I want to make people happy!
5. I burn out and quit the job.

Looking back on my career, I think I played a big part in not preventing my own burnout. I’ve tended to blame the jobs, the bosses, the bad management, but in reality I had the responsibility of keeping myself healthy and engaged.

Some things I now try to do to avoid burnout:

1. Keep eating healthy. This tends to go out the window when working too hard.
2. Keep exercising. Again, because I get so tired at work, this tends to suffer in times of burnout.
3. Push back on things that are not necessary. Saying “no” to meetings, projects, features.
4. Allow things to fail. It’s ok to miss a deadline – usually it’s due to bad management/planning and not from me dropping the ball.
5. Read articles and write code not related to work. This is fun for me and it always goes out the window during stressful times at work.

We’ll see if I can learn from past mistakes and keep myself from exploding in flames. I’ve done it twice in the past and am keen to not do it again.


Oops x1.6 million

Recently I accidentally sent out 1.6 million emails to customers. I was doing local development that integrated with a third party and didn’t realize my iterative testing was actually firing off emails in production.

The vendor doesn’t have a staging/sandbox environment so extra care is always needed when doing development.

It was a pretty embarrassing situation and I tried to take a few things away from it:

1. If you are integrating with a vendor and they only have production, find a way to mock their environment.
2. Always create safe data structures for testing. Create duplicates with the third party that mimic real data structures, then swap to live objects when going live.
3. Be careful! There should be a weighty, important feeling to doing work with integration. If you’re feeling very casual and loose, then something is wrong.
4. Put in safeguards. In this case I made a whitelist of objects that could be operated on remotely. I also put in place several business rules that would prevent mistakes in the future.

Big mistakes mean opportunities to learn big.


Responsibility and Authority

At my last job I was the default MongoDB administrator. I lived in the Mongo shell all day for almost a solid year and I definitely used it more than any other developer, so that meant all things Mongo came to my inbox. Someone needs a document (or series of documents) updated? Send it to Ryan. Someone needs a new collection, needs data migrated from one server to another, needs to test connectivity, needs data exported? Send it to Ryan.

As our platform became more stable and more websites were deployed on top of it, Mongo began to wince a bit with the added activity. We started having odd issues with our production replica set and of course part of the investigation landed on my lap.

The point of this post is that even though the responsibility for Mongo was obviously mine, the authority to do real work was not. I could work with production data, but not SSH to the production machines.

How was I supposed to successfully see what might be causing issues in production if I couldn’t get on the box and look at the logs? How could I watch a process or look at the network activity? I couldn’t. I wasn’t given the authority to connect and do my work.

I’ve seen this happen several times in my career. I end up responsible for things, but have no power to do what I need to do in order to fix problems.

I’m happy to say that I’m getting better at identifying situations where I’m stuck being responsible without any authority to effect change.


Azure Fatigue

The battle rages on with Microsoft Azure. Having come from a solid year of working with Amazon’s EC2, I find working with Azure to be very frustrating at times.

A recent example is how the IP addresses – both public and private – change on instances.

For example, I have a cloud service that is up and running with a production instance and a staging instance. The private IP addresses end with .11 and .12.

I deploy my project to the staging instance and then swap staging and production once I have verified that staging is good.

What is the state of things now?

Now I have private IP addresses that end in .10 and .12. The address that ended in .11 is gone, and the new instance on .10 is not allowed to communicate with the other machines since its address is unrecognized.

I remember going through similar pains with EC2 (getting machines to communicate internally), but at least EC2 kept things consistent. The IPs would stick to a box that wasn’t stopped and restarted. If I just rebooted a machine, the IP address would stay the same, and I didn’t have this tight deployment integration that does "magic" for me.
