R.I.Pienaar

Finding a new preferred VM host

01/15/2015

I’ve been with Linode since almost day one and I’ve never had reason to complain. Over the years they have upgraded my various machines for free, I’ve had machines with nearly 1000 days uptime with them, their control panel is great and their support is great. They have a mature set of value-added services around the core, like load balancers, backups etc. I’ve recommended them to tens of businesses and friends who are all hosted there. In total over the years I’ve probably had, or been involved in, over a thousand Linode VMs.

This is changing though: I recently moved 4 machines to their London datacenter and they have all been locking up randomly. You get helpful notices saying something like “Our administrators have detected an issue affecting the physical hardware your Linode resides on.” and on pushing the matter I got:

I apologize for the amount of hardware issues that you have had to deal with lately. After viewing your account, there have been quite a few hardware issues in the past few months. Unfortunately, we cannot easily predict when hardware issues may occur, but I can assure you that our administrators do everything possible to address and eliminate the issues as they do come up.

If you do continue to have issues on this host, we would be happy to migrate your Linode to a new host in order to see if that alleviates the issues going forward.

Which is something I can understand – yes, hardware fails randomly and unpredictably. I’m a systems guy; we’ve all been on the wrong end of this stick. But here’s the thing: in the more than 10 years I’ve been with Linode and had customers with Linode, this is something that happens very infrequently. My recent experience is completely off the scale. It’s clear there’s a problem and something has to be done. You expect your ISP to do something, and to be transparent about it.

I have other machines at Linode London that were not all moved there on the same day and they are fine. All the machines I recently moved there on the same day have this problem. I can’t comment on how Linode allocates VMs to hosts, but it seems to me there might be a bad batch of hardware or something along those lines. This is all fine, bad things happen – it’s not like Linode manufactures the hardware – and I don’t mind that, it’s just realistic. What I do mind is the vague non-answers to the problem. I can move all my machines around and play Russian roulette till it works, or Linode can own up to having a problem, properly investigate it and do something about it while being transparent with their customers.

Their community support team reached out almost a month ago, after I said something on Twitter, with “I’ve been chatting with our team about the hardware issues you’ve experienced these last few months trying to get more information, and will follow up with you as soon as I have something for you”. I replied saying I am moving machines one by one as soon as they fail, but never heard back again. So I can’t really continue to support them in the face of this.

When my Yum mirror and Git repo failed recently I decided it was time to try Digital Ocean, since that seems to be what all the hipsters are on about. After a few weeks I’ve decided they are not for me.

  • Their service is pretty barebones, which is fine in general – and I was warned about this on Twitter. But they do not even provide local resolvers; the machines are set up to use the Google resolvers out of the box. This is just not OK at all. Support confirmed they don’t and said they would pass on my feedback. Yes, I can run a local cache on the machine, but why should every one of thousands of VMs need this extra overhead in terms of config, monitoring, management, security etc. when the ISP can provide reliable resolvers like every other ISP?
  • Their London IP addresses at some point had incorrect contact details, or were assigned to a different DC or something. GeoIP databases have them being in the US, which makes all sorts of things not work well. The IP whois seems fine now, but it will take time for that to be reflected in all the GeoIP databases – quite annoying.
  • Their support system does not seem to send emails. I assume I just missed some checkbox somewhere in their dashboard, because it seems inconceivable that people sit and poll the web UI while they wait for feedback from support.

On the email thing – my anti-spam could have killed them as well, I guess. I did not investigate this too deeply because once the resolver situation became clear it seemed like wasted effort; the resolver issue was the nail in the coffin regardless.

Technically the machine was fine – it was fast, connectivity was good, IPv6 was reliable etc. But for the reasons above I am trying someone else. BigV.io contacted me to try them, so I am giving that a go and will see how it looks.

Travlrmap 1.0.0

01/10/2015

As mentioned in my previous 2 posts, I’ve been working on rebuilding my travel tracker app. It’s now reached something I am happy to call version 1.0.0, so this post introduces it.

I’ve been tracking major travels, day trips etc. since 1999, plotting them on maps using various tools like the defunct Xerox PARC Map Viewer and XPlanet, and eventually a PHP-based app I wrote to draw them on Google Maps. Over the years I’ve also looked at various services to use instead, so I don’t have to keep doing this myself, but they all die, change business focus or hold data ransom, so I am still fairly happy doing this myself.

The latest iteration of this can be seen at travels.devco.net. It’s a Ruby app that you can host on the free tier at Heroku quite easily. Feature-wise, version 1.0.0 has:

  • Responsive design that works on mobile and PC
  • A menu of pre-defined views so you can draw attention to a certain area of the map
  • Points can be categorized by type of visit – like places you've lived, visited or transited through – each with its own icon
  • Points can have URLs, text, images and dates associated with them
  • Point clustering that combines many points into one when zoomed out, with extensive configuration options
  • Several sets of colored icons for point types and clusters, and the ability to add your own
  • A web based tool to help you visually construct the YAML snippets needed using search
  • Optional authentication around the geocoder
  • Google Analytics support
  • Export to KML for viewing in tools like Google Earth
  • Full control over the Google Map like enabling or disabling the street view options

It’s important to note the intended use isn’t something like a private Foursquare or Facebook checkin service, it’s not about tracking every coffee shop. Instead it’s for tracking major city or attraction level places you’ve been to. I’m contemplating adding a mobile app to make it easier to log visits while you’re out and about but it won’t become a checkin type service.

I specifically do not use a database or anything like that; it’s just YAML files that you can check into GitHub, easily back up and hopefully never lose. Data longevity is the most important aspect for me, so the input format is simple and easy to convert to others like JSON or KML. This also means I do not currently let the app write into any data files where it’s hosted – I do not want to have to figure out the mechanics of not losing some YAML file that sits nowhere else but on a webserver. Though I am planning to enable writing to an incoming YAML file as mentioned above.
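To illustrate the data longevity point: converting the points to JSON takes nothing but the Ruby standard library. A minimal sketch, assuming the symbol-keyed YAML layout shown in the older posts below and a hypothetical points.yaml file name:

#!/usr/bin/env ruby
# Minimal sketch, not part of the app: dump the travlrmap points as JSON.
# points.yaml and the :points key are assumptions based on the layout
# shown in the older posts below.
require 'yaml'
require 'json'

points = YAML.load_file("points.yaml")[:points]

File.write("points.json", JSON.pretty_generate(points))
puts "wrote %d points to points.json" % points.size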

Getting going with your own is really easy. Open up a free Heroku account and set up a free app with one dyno. Clone the demo site into your own GitHub and push it to Heroku. That’s it – you should have your own up and running with placeholder content, ready to start receiving your own points, which you can make using the included geocoder. You can also host it on any Ruby-based app server like Passenger without any modification from the Heroku one.

The code is on GitHub at ripienaar/travlrmap under Apache 2. Docs for using it and configuration references are on its dedicated gh-pages page.

Marker clustering using GMaps.js

01/05/2015

In a previous post I showed that I am using a KML file as input into GMaps.js to put a bunch of points on a map for my travels site. This worked great, but I really want to do some marker clustering since too many individual points look pretty bad, as can be seen below.

I’d much rather do some clustering and expand out to multiple points when you zoom in like here:

Turns out there are a few libraries for this already. I started out with one called Marker Clusterer but ended up with an improved version of it called Marker Clusterer Plus. And GMaps.js supports cluster libraries natively, so this should be easy, right?

Turns out Google Maps loads KML files as a layer over the map, and the points are simply not accessible to any libraries, so the cluster libraries do nothing. OK, so back to drawing points using my own code.

I added an endpoint to the app that emits my points as JSON:

[
  {"type":"visit",
   "country":"Finland",
   "title":"Helsinki",
   "lat":60.1333,
   "popup_html":"<p>\n<font size=\"+2\">Helsinki</font>\n<hr>\nBusiness trip in 2005<br /><br />\n\n</p>\n",
   "comment":"Business trip in 2005",
   "lon":25,
   "icon":"/markers/marker-RED-REGULAR.png"}
]
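Serving this from the Sinatra app is a one-route affair. A minimal sketch of what such an endpoint could look like – the points.yaml loading, popup markup and icon choice here are simplified assumptions, not the app's actual code:

require 'sinatra'
require 'json'
require 'yaml'

# Sketch of the /points/json route; the YAML layout follows the older
# posts below, and the popup markup and icon selection are simplified.
get '/points/json' do
  content_type :json

  points = YAML.load_file("points.yaml")[:points]

  points.map do |p|
    { :type       => p[:type].to_s,
      :country    => p[:country],
      :title      => p[:title],
      :lat        => p[:lat],
      :lon        => p[:lon],
      :comment    => p[:comment],
      :popup_html => "<p>\n<font size=\"+2\">#{p[:title]}</font>\n<hr>\n#{p[:comment]}<br /><br />\n\n</p>\n",
      :icon       => "/markers/marker-RED-REGULAR.png"
    }
  end.to_json
end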

Now adding all the points and getting them clustered is pretty easy:

<script type="text/javascript">
    var map;

    // turn the rows of JSON point data into GMaps.js marker descriptions
    // and add them to the map in one batch
    function addPoints(data) {
      var markers_data = [];
      if (data.length > 0) {
        for (var i = 0; i < data.length; i++) {
          markers_data.push({
            lat: data[i].lat,
            lng: data[i].lon,
            title: data[i].title,
            icon: data[i].icon,
            infoWindow: {
              content: data[i].popup_html
            }
          });
        }
      }
      map.addMarkers(markers_data);
    }

    $(document).ready(function(){
      map = new GMaps({
        div: '#main_map',
        zoom: 15,
        lat: 0,
        lng: 20,
        // GMaps.js expects this callback to return a clusterer instance
        markerClusterer: function(map) {
          var options = {
            gridSize: 40
          };

          return new MarkerClusterer(map, [], options);
        }
      });

      var points = $.getJSON("/points/json");
      points.done(addPoints);
    });
</script>

This is pretty simple: the GMaps() object takes a markerClusterer option that expects an instance of the clusterer. I fetch the JSON data and each row gets added as a point, and then it all just happens automagically. Marker Clusterer Plus can take a ton of options that let you specify custom icons, grid sizes, tweak when clustering kicks in etc. Here I am just setting the gridSize to show how to do that. On my map I have custom icons for the clusters; I might blog about that later once I’ve figured out how to get them to behave perfectly.

You can see this in action on my travel site. As an aside, I’ve taken a bit of time to document how the Sinatra app works and put together a demo deployable to Heroku that should give people hints on how to get going if they want to make a map of their own.

Ruby, Google Maps and KML

12/26/2014

Since 1999 I have kept a record of most places I’ve traveled to. In the old days I used a map viewer from Xerox PARC to view these travels, and then XPlanet, which made a static image. Back in 2005, as Google Maps became usable from JavaScript, I made something to show my travels on an interactive map, using GMapsEZ and PHP to draw points from an XML file.

Since then Google made their v2 API defunct and something went bad with the old PHP code, so the time came to revisit all of this for the 4th iteration of a map tracking my travels.

Google Earth came out in 2005 as well – just a bit too late for me to use its data formats – but today it seems obvious that the data belongs in a KML file. Hand-building KML files is not on though, so I needed something to build the KML file in Ruby.

My new app maintains points in YAML files with more or less the same format as the old PHP system.

To let people come up with their own categories of points, you first define a set of point types:

:types:
  :visit:
    :icon: http://your.site/markers/mini-ORANGE-BLANK.png
  :transit:
    :icon: http://your.site/markers/mini-BLUE-BLANK.png
  :lived:
    :icon: http://your.site/markers/mini-GREEN-BLANK.png

And then we have a series of points, each referencing a type:

:points:
- :type: :visit
  :lon: -73.961334
  :title: New York
  :lat: 40.784506
  :country: United States
  :comment: Sample Data
  :href: http://en.wikipedia.org/wiki/New_York
  :linktext: Wikipedia
- :type: :transit
  :lon: -71.046524
  :title: Boston
  :lat: 42.363871
  :country: United States
  :comment: Sample Data
  :href: http://en.wikipedia.org/wiki/Boston
  :linkimg: https://pbs.twimg.com/profile_images/430836891198320640/_-25bnPr.jpeg

Here we have 2 points; both link to Wikipedia, one using text and one using an image. One is a visit and one is a transit.

I use the ruby_kml gem to convert this into KML.

First we set up the basic document and create the types as KML styles:

kml = KMLFile.new
document = KML::Document.new(:name => "Travlrmap Data")
 
@config[:types].each do |k, t|
  document.styles << KML::Style.new(
    :id         => "travlrmap-#{k}-style",
    :icon_style => KML::IconStyle.new(:icon => KML::Icon.new(:href => t[:icon]))
  )
end

This sets up the types and gives them names like travlrmap-visit-style.

We’ll now reference these in the KML file for each point:

folder = KML::Folder.new(:name => "Countries")
folders = {}
 
@points.sort_by{|p| p[:country]}.each do |point|
  unless folders[point[:country]]
    folder.features << folders[point[:country]] = KML::Folder.new(:name => point[:country])
  end
 
  folders[point[:country]].features << KML::Placemark.new(
    :name        => point[:title],
    :description => point_comment(point),
    :geometry    => KML::Point.new(:coordinates => {:lat => point[:lat], :lng => point[:lon]}),
    :style_url   => "#travlrmap-#{point[:type]}-style"
  )
end
 
document.features << folder
kml.objects << document
 
kml.render

The points are put in folders by individual country, so in Google Earth I get a nice list of countries to enable and disable as I please.

I am not showing how I create the comment HTML here – it’s the point_comment method – it’s just boring code with a bunch of ifs around linkimg, linktext and href. KML documents do not support all of HTML, but the basics are there, so this is pretty easy.
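For illustration, a sketch of what such a point_comment helper might look like – the real method is not shown in this post, so treat the markup and logic here as assumptions:

# Illustrative sketch only: build the KML description HTML for a point
# from its comment plus an optional href with linktext or linkimg.
def point_comment(point)
  html = "%s<br /><br />" % point[:comment]

  if point[:href]
    if point[:linkimg]
      html << "<a href=\"%s\"><img src=\"%s\"/></a>" % [point[:href], point[:linkimg]]
    elsif point[:linktext]
      html << "<a href=\"%s\">%s</a>" % [point[:href], point[:linktext]]
    end
  end

  html
end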

So this is the basics of making a KML file from your own data. It’s fairly easy, though the docs for ruby_kml aren’t that hot and specifically don’t tell you that you have to wrap all the points, styles and so forth in a Document as I have done here – it seems to be a recent requirement of the KML spec though.

Next up we have to get this stuff onto a Google map in a browser. As KML is the format Google Earth uses, it’s safe to assume the Google Maps API supports it directly. Still, a bit of sugar around the Google APIs is nice because they can be a bit verbose. Previously I used GMapsEZ – which I ended up really hating, as the author did all kinds of things like refusing to make it available for download and instead hosting it on an unstable host. Now I’d say you must use gmaps.js to make it really easy.

For viewing a KML file, you basically just need this – more or less directly from their docs – there’s some ERB template stuff in here to set up the default viewport etc.:

<script type="text/javascript">
    var map;
    var infoWindow;

    $(document).ready(function(){
      infoWindow = new google.maps.InfoWindow({});
      map = new GMaps({
        div: '#main_map',
        zoom: <%= @map_view[:zoom] %>,
        lat: <%= @map_view[:lat] %>,
        lng: <%= @map_view[:lon] %>
      });

      map.loadFromKML({
        url: 'http://your.site/kml',
        suppressInfoWindows: true,
        preserveViewport: true,
        events: {
          // show our own info window built from the KML description
          click: function(point){
            infoWindow.setContent(point.featureData.infoWindowHtml);
            infoWindow.setPosition(point.latLng);
            infoWindow.open(map.map);
          }
        }
      });
    });
</script>

Make sure there’s a main_map div set up with your desired size and the map will show up there. Really easy.

You can see this working on my new travel site at travels.devco.net. The code is on GitHub as usual, but it’s a bit early days for general use or release. The generated KML file can be fetched here.

Right now it supports a subset of the old PHP code’s features – mainly, drawing lines is missing. I hope to add a way to provide some kind of index of GPX files to show tracks, as I have a few of those. Turning a GPX file into a KML file is pretty easy and the above JS code should show it without modification.
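As a taste of how simple that conversion is, here is a rough, untested sketch using REXML and ruby_kml – KML::LineString and its coordinates attribute are my assumptions about the gem, and track.gpx is a hypothetical input file:

require 'rexml/document'
require 'kml' # the ruby_kml gem

# Rough sketch: turn GPX track points into a KML LineString
gpx = REXML::Document.new(File.read("track.gpx"))

# KML wants whitespace separated "lon,lat" tuples
coordinates = []

gpx.elements.each("//trkpt") do |pt|
  coordinates << "%s,%s" % [pt.attributes["lon"], pt.attributes["lat"]]
end

kml = KMLFile.new
document = KML::Document.new(:name => "GPX Track")
document.features << KML::Placemark.new(
  :name     => "Track",
  :geometry => KML::LineString.new(:coordinates => coordinates.join(" "))
)
kml.objects << document

File.write("track.kml", kml.render)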

I’ll post a follow-up here once the code is sharable. If you’re brave though and know Ruby, you can grab the travlrmap gem to install your own.

The problem with params.pp

12/09/2013

My recent post about using Hiera data in modules has generated a great deal of discussion already – several thousand blog views, comments, tweets and private messages on IRC. Thanks for the support and encouragement – it’s clear this is a very important topic.

I want to expand on yesterday’s post by giving some background on the underlying motivations that caused me to write this feature, and on why having it as a forge module is highly undesirable but the only current option.

At the heart of this discussion is the params.pp pattern and the general problems with it. To recap: the basic idea is to embed all your default data in a file called params.pp, typically in huge case statements, and then reference this data as defaults. Some examples of this are the puppetlabs-ntp module, the Beginners Guide to Modules, and the example I had in the previous post, which I’ll reproduce below:

# ntp/manifests/init.pp
class ntp (
     # allow for overrides using resource syntax or data bindings
     $config = $ntp::params::config,
     $keys_file = $ntp::params::keys_file
   ) inherits ntp::params {
 
   # validate values supplied
   validate_absolute_path($config)
   validate_absolute_path($keys_file)
 
   # optionally derive new data from supplied data
 
   # use data
   file{$config:
      ....
   }
}

# ntp/manifests/params.pp
class ntp::params {
   # set OS specific values
   case $::osfamily {
      'AIX': {
         $config = "/etc/ntp.conf"
         $keys_file = '/etc/ntp.keys'
      }
 
      'Debian': {
         $config = "/etc/ntp.conf"
         $keys_file = '/etc/ntp/keys'
      }
 
      'RedHat': {
         $config = "/etc/ntp.conf"
         $keys_file = '/etc/ntp/keys'
      }
 
      default: {
         fail("The ${module_name} module is not supported on an ${::osfamily} based system.")
      }
   }
}

Now, as Puppet stands today, this is pretty much the best we can hope for. It achieves a lot of useful things:

  • The data that provides OS support is contained and separate
  • You can override it using resource style syntax or Puppet 3 data bindings
  • The data provided using any means is validated
  • New data can be derived by combining supplied or default data

You can now stick this module on the forge and users can use it; it supports many operating systems and works on pretty much any Puppet going back quite a way. These are all good things.

The list above also demonstrates the main purposes of having data in a module – different OS/environment support, allowing users to supply their own data, validation, and transmogrifying the data. The params.pp pattern achieves all of this.

So what’s the problem then?

The problem is: the data is in the code. In the pre-extlookup and Hiera days we put our site data in case statements, inheritance trees, node data or any of a number of different solutions. These all solved the basic problem – our site got configured and our boxes got built – just like the params.pp pattern solves the basic problem. But we wanted more; we wanted our data separate from our code. Not only did it seem natural – almost every other known programming language supports and embraces this – but as Puppet users we wanted a number of things:

  • Less logic, syntax, punctuation and “programming”, and more files that just look a whole lot like configuration
  • Better layering than inheritance and the other tools at our disposal allowed. We want to structure our configuration like we do our DCs, environments and other components – these form a natural series of layered hierarchies.
  • We do not want to change code when we want to use it; we want to configure that code to behave according to our site's needs. In a CM world, data is configuration.
  • If we’re in an environment that does not let us open source our work or contribute to open source repositories, we do not want to be forced to fork and modify open source code just to use it in our environments. We want to configure the code. Compliance needs should not force us to solve every problem in-house.
  • We want to plug into existing data sources like LDAP, or be able to create self-service portals for our users to supply this configuration data. But we do not want to change our manifests to achieve this.
  • We do not want to be experts at using source control systems. We use them, we love them and agree they are needed. But like everything, less is more. Simple is better. A small, simple workflow we can manage at 2am is better than a complex one.
  • We want systems we can reason about. A system that takes configuration in the form of data trumps one that needs programming to change its behaviour.
  • Above all we want a system that’s designed with our use cases in mind. Our user experience needs are different from those of programmers. Our data needs are different and hugely complex. Our CM system must both guide us with its design and be compatible with our existing approaches. We do not want to have to write our own external node data sources simply because our language does not provide solid solutions to this common problem.

I created Hiera with these items in mind after years of talking to probably 1000+ users, and after iterating on extlookup to keep pace with the Puppet language gaining support for modern constructs like Hashes. True, it’s not a perfect solution to all these points – transparency of data origin, to name but one – but there are approaches that would make small improvements to achieve these, and it does solve a high percentage of the above problems.

Over time Hiera has gained a tremendous following – it’s now the de facto standard for solving the problem of site configuration data, largely because it’s pragmatic, simple and designed to suit the task at hand. In recognition of this I donated the code to Puppet Labs, and to their credit they integrated it as a default prerequisite and created the data binding system. The elephant in the room is our modules though.

We want to share our modules with other users. To do this we need to support many operating systems, and to do that we need to put a lot of data in the modules. We can’t use Hiera for this in a portable fashion because the module system needs improvement. So we’re stuck in the proverbial dark ages, embedding our data in code and gaining none of the advantages Hiera brings to site data.

Now we have a few options open to us. We can just suck it up and keep writing params.pp files, gaining none of the above advantages that Hiera brings. This is not great, and the puppetlabs-ntp module example I cited shows why. We can come up with ever more elaborate ways to wrap, extend and override the data provided in a params.pp, or entertain far-out ideas like having the data binding system query the params.pp data directly. In other words, we can pander to the status quo, assume we cannot improve the system, and iterate on an inherently bad idea instead. The alternative is to improve Puppet.

Every time the question of params.pp comes up, the answer seems to be how to improve the way we embed data in the code. This is absolutely the wrong answer. The answer should be: how do we improve Puppet so that we do not have to embed data in code? We know people want this; the popularity and wide adoption of Hiera has shown that they do. The core advantages of Hiera might not be well understood by all, but the userbase does understand and treasure the gains they get from using it.

Our task is to support the community in the investment they have made in Hiera. We should not be rewriting it in a non-backwards-compatible way, throwing away past learnings, simply because we do not want to understand how we got here. We should be iterating with small additions, rounding this feature out into one solid, ever-present data system that every user of Puppet can rely on being there on every Puppet system.

Hiera adoption has reached critical mass; it’s now the solution to this problem. This is a great and historic moment for the Puppet community – to rewrite it, throw it away or propose orthogonal solutions to this problem space would be a great disservice to the community and the Puppet product as a whole.

Towards this I created a Hiera backend that goes some way to resolving this, as a natural progression of Hiera’s design. It improves the core features provided by Puppet in a way that will allow better patterns than the current params.pp one to be created, and that will in the long run greatly improve the module writing and sharing experience. This is what my previous blog post introduced: a way forward from the current params.pp situation.

Now, by rights, a solution to this problem belongs in Puppet core. A Puppet Forge dependency module just to get this ability – especially one not maintained by Puppet Labs, and especially one that monkey patches its way into the system – is not desirable at all. This is why the code was a PR first. The only alternatives are to wait in the dark – numerous queries by many members of the community to the Puppet product owner have yielded only vague statements of intent or outcome – or to take it upon ourselves to improve the system.

So I hope the community will support me in using this module and will work with me to come up with better patterns to replace the params.pp ones – iterating on and improving the system as a whole rather than just sucking up the status quo and not moving forward.
