Big Ruby Conf 2014 Notes

These are the notes I took while attending Big Ruby 2014 in Grapevine, Texas.

Testing the untestable

  • Start with testing as though code is in a black box
  • Heroku does this for buildpacks by actually running git push, deploy, and build against them
  • heroku run bash opens a shell on a one-off dyno
  • If Heroku can deploy 6 times and get at least 1 passing test suite, they consider it successful
  • Heroku test speeds went from 5 minutes per test suite to 44 tests in 12 minutes.
    • The speaker has worked at companies whose Rails tests ran for 30 minutes per suite
  • parallel_rspec (part of the parallel_tests gem) is worth looking into for speeding up tests
  • Testing allowed heroku to be aggressive in refactoring.
    • They used to practice Minimum Viable Patching because they wanted to add/modify the least amount of code per patch
  • Build out smaller black boxes and unit test those smaller components
  • Untestable scenarios
    • Mock for determinism (WebMock for small pages, VCR for larger network traffic)
  • codetriage.com
  • If things seem too big, start with integration tests
  • Test things that would hurt if they break.
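
To make "mock for determinism" concrete, here is a minimal sketch. WebMock and VCR do this at the HTTP layer; this example swaps in plain dependency injection so it runs without any gems. StatusChecker, the fetcher lambda, and the URL are all hypothetical names invented for illustration.

```ruby
require "json"

# Hypothetical client: the HTTP call is injected so tests can replace it.
class StatusChecker
  def initialize(fetcher)
    @fetcher = fetcher # anything responding to #call(url) and returning a JSON string
  end

  def healthy?(url)
    JSON.parse(@fetcher.call(url)).fetch("status") == "ok"
  end
end

# In a test, a canned response stands in for the network, so the result
# is the same on every run.
fake_fetcher = ->(_url) { '{"status": "ok"}' }
checker = StatusChecker.new(fake_fetcher)
puts checker.healthy?("https://example.invalid/health") # prints "true"
```

In production the same class would receive a fetcher that performs a real request; only the test replaces it.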

Business intelligence

  • Supporting decision making
  • Data is generally an afterthought to devs; it’s just storage for objects
  • How do you use the data that’s stored to make business decisions
  • The naive approach
    • “We can run reports straight from the transactional db”
    • The complex joins are slow
    • affects production performance
  • Figure out the business needs instead of just feature requests
  • The business might not know which data is valuable (e.g., we know how loyal customers are by their last login date)
  • Turn inferences into facts
  • De-normalize data for the facts. It’s ok for data to be stored in many places if the facts are only stored once.
  • Tables can communicate more than graphs
  • “A” recipe for success
    • Use Ruby - TDD, deployment, “data munging”, many powerful ORMs
    • MongoDB
      • Flexible schemas
      • Map/Reduce
      • New Aggregation framework – Check this out
    • Postgres for transactional
  • Minimize on-demand calculations
  • Store facts, not attributes (denormalized)
  • SQL is the recipe and the cake is the report. Store the cake too.
  • Not storing objects, storing events (events don’t change)
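
A rough sketch of "turn inferences into facts" and "store the cake too", using plain Ruby structures instead of a database. LoginEvent, build_loyalty_facts, and the 30-day activity threshold are assumptions made up for this example; they are not from the talk.

```ruby
require "date"

# Hypothetical immutable event: something that happened, never updated.
LoginEvent = Struct.new(:customer_id, :occurred_on)

# Turn an inference ("recent logins imply loyalty") into a stored fact.
# The 30-day threshold is an assumption made for this example.
def build_loyalty_facts(events, as_of:, active_within_days: 30)
  events.group_by(&:customer_id).map do |customer_id, logins|
    last_login = logins.map(&:occurred_on).max
    {
      customer_id: customer_id,
      last_login_on: last_login,
      active: (as_of - last_login).to_i <= active_within_days,
    }
  end
end

events = [
  LoginEvent.new(1, Date.new(2014, 2, 1)),
  LoginEvent.new(1, Date.new(2014, 2, 20)),
  LoginEvent.new(2, Date.new(2013, 11, 5)),
]

# Computed once (e.g. nightly) and stored, so reports read the result
# instead of running joins against the transactional db on demand.
loyalty_facts = build_loyalty_facts(events, as_of: Date.new(2014, 2, 22))
loyalty_facts.each { |fact| puts fact }
```

The events are never modified; only the derived fact rows are rebuilt when the report is refreshed.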

Job processing

  • Queues and Processors
  • Queues
    • Durability, Availability, Throughput (pick two)
      • Durability - Once a message is sent, how likely is it to be received
      • Availability - How fault tolerant
      • Throughput - How many jobs can we push through
      • Cost?
    • RabbitMQ - Fast, durable, reasonably fault tolerant (entirely how you configure it)
    • Amazon SQS - highly Durable, highly Available, reasonable throughput, Not fast
    • Redis - High throughput, fairly durable, availability is hard, really fast
    • Not all jobs are the same, Tapjoy has roughly 2 different types of jobs
      • X00,000 / minute (events) - RabbitMQ
      • 1,000,000 / hour (general jobs) - SQS
  • Processors
    • Definition
      • A way to retrieve jobs
      • A way to match that job to code
      • A way to run that code
    • Concurrently with RabbitMQ and Fibers
      • Ruby - Forks, Fibers, Threads
      • Rabbit - Async, stateful connections (retrieve a job and mark it finished on the same connection), no batch processing
      • Fibers - Coroutines on the main stack, Fibers are great for async, no worries about cross thread talking
        • If CPU bound, it’s not worth it. However, IO bound is OK.
        • Pain points
          • SystemStackError - a Fiber’s stack is a fixed size (the size varies between Rubies), so deep call stacks overflow it (e.g., can’t save an AR object in a Fiber)
          • Needs Fiber-aware network libraries
          • Code needs to know it’s on a fiber
      • Fibers aren’t good for general-purpose jobs and RabbitMQ is complex. The speaker said he wished he had looked harder at Celluloid.
    • SQS Job Processor
      • SQS - HTTP API, Synchronous, allows for batch processing
      • Why not threads
        • Global Interpreter Lock
        • Thread Safety
        • Memory Bloat
      • Building the processor
        • Threads to fetch
        • Fork for each batch of messages
        • Each child has a pipe to the parent to share stats
        • However, pipe processing is slow (Tapjoy gets 6 children before hitting the CPU upper bound)
        • Resolv is a DNS resolver written in Ruby instead of C; the C resolver blocks other threads (GIL), so use Resolv instead if you run more than 1 thread
      • Look into learning to use GDB (also useful is the LLVM equivalent, lldb)
      • Sometimes the bug is not your bug
      • Chore: Resque-compatible serialization, per-job config, jobs for each server, queue agnostic and concurrency agnostic
        • Not released yet :(
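
The three-part definition of a processor above (retrieve jobs, match a job to code, run that code) can be sketched roughly as follows. This is not Tapjoy's processor or Chore: the queue is a plain in-memory array standing in for SQS or RabbitMQ, and Processor, SendEmail, ResizeImage, and the JSON message shape are all hypothetical.

```ruby
require "json"

# Hypothetical job classes; each knows how to run itself.
class SendEmail
  def self.perform(args)
    "emailed #{args.fetch("to")}"
  end
end

class ResizeImage
  def self.perform(args)
    "resized #{args.fetch("path")}"
  end
end

class Processor
  # A way to match a job to code: a registry keyed by job name.
  HANDLERS = { "send_email" => SendEmail, "resize_image" => ResizeImage }.freeze

  def initialize(queue)
    # A way to retrieve jobs: any object whose #shift returns a JSON string or nil.
    @queue = queue
  end

  # A way to run that code: process until the queue is empty, collecting results.
  def drain
    results = []
    while (raw = @queue.shift)
      message = JSON.parse(raw)
      handler = HANDLERS.fetch(message.fetch("job"))
      results << handler.perform(message.fetch("args"))
    end
    results
  end
end

queue = [
  JSON.generate({ "job" => "send_email", "args" => { "to" => "a@example.com" } }),
  JSON.generate({ "job" => "resize_image", "args" => { "path" => "cat.png" } }),
]
puts Processor.new(queue).drain
```

Because the processor only depends on `#shift`, the queue backend and the concurrency model (forks, fibers, threads) can be changed without touching the job classes.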

Shopify sharded Rails

  • Sharding - spreading data over more than 1 db
  • Rails assumes 1 db
  • Auto-increment doesn’t work across shards
  • You can normally get by on tuning and caching for years
  • Why Shard
    • Smaller indexes
    • Better locality, data stays warmer longer
  • No joins between shards. Normally there is a ‘stop and shard’ moment; Shopify doesn’t do this.
  • Denormalized everything to have a shop_id since everything is scoped through shop
  • Noeqd, like Snowflake for ids
  • Make sure ids are JavaScript safe (exactly representable as a double, i.e. at most 2^53 - 1)
  • Shopify ids are auto_incr + N * auto_incr
  • Rebalancing (moving a shop across shards): lock the shop, then move it
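
One way to read the id notes above is as an offset-plus-increment scheme, where each shard starts at a different offset and steps by the number of shards, so ids never collide. That reading is my assumption, not something the talk confirmed. ShardIdGenerator, the shop-to-shard table, and shard_for are hypothetical.

```ruby
# Largest integer a JavaScript number (IEEE 754 double) represents exactly.
MAX_JS_SAFE_INTEGER = 2**53 - 1

# Hypothetical per-shard id generator: shard k of `shard_count` hands out
# k, k + shard_count, k + 2 * shard_count, ... so shards never collide.
class ShardIdGenerator
  def initialize(shard_number:, shard_count:)
    unless (1..shard_count).cover?(shard_number)
      raise ArgumentError, "shard_number must be in 1..shard_count"
    end

    @offset = shard_number
    @increment = shard_count
    @n = 0
  end

  def next_id
    id = @offset + @n * @increment
    raise RangeError, "id #{id} is not JavaScript safe" if id > MAX_JS_SAFE_INTEGER

    @n += 1
    id
  end
end

# Everything is scoped by shop_id, so the shard is looked up from the shop.
# This in-memory table is an assumption; rebalancing would update the entry
# while the shop is locked.
SHOP_TO_SHARD = { 101 => 1, 102 => 2, 103 => 1 }.freeze

def shard_for(shop_id)
  SHOP_TO_SHARD.fetch(shop_id)
end

shard_one = ShardIdGenerator.new(shard_number: 1, shard_count: 2)
shard_two = ShardIdGenerator.new(shard_number: 2, shard_count: 2)
p [shard_one.next_id, shard_one.next_id, shard_two.next_id] # => [1, 3, 2]
p shard_for(103)                                            # => 1
```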

Living Social lightning talks

  • Check out Pry
  • text-table gem for viewing data in the console
  • Use Thor tasks to grep and view logs
  • “Telling a programmer there is already a library for that is like telling a song writer there is already a song about love”

Castle on a cloud

  • Look up ChatOps video
  • Common tasks are automated through Hubot commands
  • IAM is basically like LDAP. Can limit within a bucket too.
  • Look into Graphite for graphing. I think it’s a Python app
  • VPC as a VPN
  • GitHub is a MySQL shop and uses RDS even for Heroku-based apps

Working effectively on a distributed team

  • As Remote Worker
    • Be visible
      • make sure people see your work
      • speak up in meetings
      • Use video, if you have to talk to someone, make it a video
      • Be visible with your calendar (we currently don’t really do this)
    • Be yourself
      • LivingSocial has an ‘off topic’ room in Campfire that lets people show their personalities by being silly and having fun
        • You need the water cooler time even online. Work can be / is social
        • People need to see you in addition to your work
    • Be disciplined
      • Not just avoiding laziness; also be effective with the team (e.g., not working during odd hours)
      • Set up a dedicated workspace
    • Be available
    • Be flexible
    • Be free (e.g., if you want to move around, move around. just do your work)
  • Use video - it’s well worth the additional effort, even if it’s 6fps
  • Screenhero for pair programming (screen sharing and audio)

Refactoring with science

  • GitHub used to say “never break the API”; now, because of usage data, they can make changes and contact the affected users
  • GitHub deploys branches to production, rather than merging into master and then deploying
  • This allows for easier rollbacks: just deploy master again to revert
  • dat-science allows for basically A/B Testing of code
  • dat-analysis allows for visualizing the dat-science results
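
The idea behind running old and new code side by side can be sketched in plain Ruby. This is not dat-science's actual API; Experiment, Result, and the keyword arguments are hypothetical names that only illustrate the pattern: always return the existing (control) result, run the refactored (candidate) code as well, and record whether the two agree.

```ruby
# Hypothetical mini-experiment: always return the control result, but also
# run the candidate and record whether the two agree.
class Experiment
  Result = Struct.new(:name, :matched, :control_value, :candidate_value)

  attr_reader :results

  def initialize(name)
    @name = name
    @results = []
  end

  def run(control:, candidate:)
    control_value = control.call
    candidate_value =
      begin
        candidate.call
      rescue StandardError => e
        e # a crashing candidate must never change what callers see
      end
    @results << Result.new(@name, control_value == candidate_value, control_value, candidate_value)
    control_value
  end
end

experiment = Experiment.new("sorted-list")
value = experiment.run(
  control:   -> { [3, 1, 2].sort },   # existing code path
  candidate: -> { [3, 1, 2].min(3) }  # refactored code path
)
p value                           # => [1, 2, 3]
p experiment.results.last.matched # => true
```

The recorded results are what a tool in the role of dat-analysis would then summarize.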

Building a service

  • Before writing a service, they write a spec
  • Spec contents: curl-ish example, description, URL params, request body, response (201 created, 409 exist… etc)
  • Then they write the client. It allows them to understand error handling
  • Writing configuration
  • Rack::Test is good for testing API services
  • Dev would be best in Docker or Vagrant, but Union Metrics (Austin dev team) just ‘wings it’
  • Ops is important for meeting the SLA
  • Benefits
    • Small deploys and experiments (e.g., with different languages), because a service only has to talk HTTP
    • Experiments are double edged: happy programmers due to trying new things, but you might get boxed into something
    • Clients and servers are built in parallel for faster development. Stubbing the server allows the client to be visualized quicker
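
As a sketch of the spec-first flow, the example below writes the spec as a comment, then a tiny service object that follows the Rack calling convention (an env hash in, a [status, headers, body] array out). It is called directly here so it runs without the rack gem; Rack::Test would normally build the env for you. WidgetService, the /widgets endpoint, and post_env are invented for illustration.

```ruby
require "json"
require "stringio"

# Hypothetical spec, written before the code:
#   POST /widgets  {"name": "..."}  -> 201 created, or 409 if the name exists
class WidgetService
  def initialize
    @names = {}
  end

  # Rack calling convention: take an env hash, return [status, headers, body].
  def call(env)
    unless env["REQUEST_METHOD"] == "POST" && env["PATH_INFO"] == "/widgets"
      return respond(404, { error: "not found" })
    end

    name = JSON.parse(env["rack.input"].read).fetch("name")
    return respond(409, { error: "widget exists" }) if @names.key?(name)

    @names[name] = true
    respond(201, { name: name })
  end

  private

  def respond(status, payload)
    [status, { "content-type" => "application/json" }, [JSON.generate(payload)]]
  end
end

# Build a minimal env by hand (only the keys this service reads).
def post_env(path, body)
  { "REQUEST_METHOD" => "POST", "PATH_INFO" => path, "rack.input" => StringIO.new(body) }
end

service = WidgetService.new
p service.call(post_env("/widgets", '{"name": "gear"}')).first # => 201
p service.call(post_env("/widgets", '{"name": "gear"}')).first # => 409
```

Writing the client against these two status codes first is what surfaces the error handling early.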

Legacy codebase

  • It’s not a problem until it’s a problem
  • Think of working with legacy code like an archaeologist
    • You have to dig, research, piece together… all to figure out what happened and why these choices were made
    • Survey, Excavation, Analysis
  • Survey
    • Take inventory with tests, code comments, new relic, benchmarks
    • Ask dumb questions and flag myths. You’ll be able to figure out the culture, which leads to insights into the code
    • Tricks -> Techniques -> Process -> Methodology -> Dogma (seems like you should watch out for this)
  • Excavation
    • If you have a goal to fix something, fix something. If something else comes out of the excavation, make an issue and get to it after your goal
    • “First, do no harm”
    • Use warn when making changes while leaving current implementation in place
  • Analysis
    • Keep up the documentation in blog posts, wiki articles, readme or even tests
    • Document everything. Even the most trivial things like initializing from scratch. Also keep it up to date
    • Make a map. Search for gems that generate UML diagrams for a Rails app
  • The fear of looking at and working with a legacy codebase is all in your head
  • Ask why is it the way it is
  • It’s not bad code in the sense that it’s not working (it probably is working) but it just never went to refactor after green
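
The "use warn while leaving the current implementation in place" tip from the Excavation notes might look like the sketch below. Invoice and both method names are hypothetical; Kernel#warn and Kernel#caller are standard Ruby.

```ruby
# Hypothetical legacy class: the old method keeps working, but warns callers
# so its usage can be tracked down before it is removed.
class Invoice
  def initialize(cents)
    @cents = cents
  end

  # New implementation.
  def total_in_dollars
    @cents / 100.0
  end

  # Current entry point left in place; it delegates and emits a warning
  # on stderr that names the call site.
  def total
    warn "[DEPRECATED] Invoice#total is deprecated, use #total_in_dollars " \
         "(called from #{caller(1, 1).first})"
    total_in_dollars
  end
end

invoice = Invoice.new(1250)
p invoice.total # warns on stderr, then prints 12.5
```

Existing callers keep getting the same answer, which keeps the change in line with "first, do no harm".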

Active interaction

  • “Skinny controllers, fat models” leads to god models
  • Rails says to focus models on NOUNS
  • Consider making models with verbs
  • active_interaction
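
A "verb" model can be sketched in plain Ruby as one class per business action. The active_interaction gem offers a structured way to do this; the code below does not use the gem's API. SignUpUser, Outcome, and the validation rules are hypothetical.

```ruby
# Hypothetical "verb" object: one class per business action, instead of
# adding yet another method to a User god model.
class SignUpUser
  Outcome = Struct.new(:success, :value, :errors) do
    def success?
      success
    end
  end

  def self.run(email:, password:)
    new(email, password).run
  end

  def initialize(email, password)
    @email = email
    @password = password
  end

  def run
    errors = []
    errors << "email is invalid" unless @email.to_s.include?("@")
    errors << "password is too short" if @password.to_s.length < 8
    return Outcome.new(false, nil, errors) unless errors.empty?

    # A real app would persist a record here; a hash stands in for it.
    Outcome.new(true, { email: @email }, [])
  end
end

outcome = SignUpUser.run(email: "dev@example.com", password: "correct horse")
p outcome.success? # => true
p SignUpUser.run(email: "nope", password: "short").errors
# => ["email is invalid", "password is too short"]
```

A controller then calls the verb and branches on the outcome, and the noun model stays small.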

Key models

  • RDB - Relations, transactions, schemas, ability to extend and ad-hoc queries
  • KVS - Schema-less, single-access reads, write-heavy (append only), easier to scale (restricted api), it’s just a hash

Managing fleets (could be Postgres)

  • Clint is from Missouri. Not St. Louis, not KC. Columbus.
  • In the beginning there was a Sinatra app talking to AWS using Sequel
  • The simplest thing that could possibly work, but no less
  • Now 5 apps in Sinatra, Fog to talk to AWS, and still Sequel
  • They use Sinatra to map to other Sinatra apps based on URL
  • 50 worker types, 100s of workers all doing the heavy lifting
  • Each machine only has the OS, Postgres, and WAL-E
  • Monitoring is from the outside in. Using State Machines and stateless workers
  • This is kind of like in a game. Loop over: observe environment, do something based on env
  • Their queue loop is: while there is a resource, feel, do, requeue, over and over again
  • Feeler - Collects data on the resource. Appends data to an observation table. Finds out state as well
  • Think - Evaluate the state the resource is in (e.g., if it’s uncertain, do x)
  • Stateless (background) workers talk to AWS, the Heroku API, and Postgres because they all require a network connection
  • Failure
    • notify and push to another queue
    • it either resolves or needs a human
    • the notifications lead to the development of playbooks based on the type of incident notification.
    • They have codified their playbook for different incidents that can resolve them (like restarting)
    • “Have you tried turning it off and on again?” - can resolve a lot of AWS issues because images will pop up on other machines
    • If the codified resolver can’t do it, it says “I need an adult” and sends it to a human
  • State Machines are great
  • Use Stateless workers
  • Expect things to break
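
The feel / think / do loop with a state machine and a codified playbook can be sketched as below. This is a toy illustration, not Heroku's implementation: ResourceMonitor, the state names, the transition table, and the probe lambda (standing in for a network check done by a stateless worker) are all assumptions.

```ruby
# Hypothetical resource monitor following the feel -> think -> do loop.
class ResourceMonitor
  # State machine: [current state, observation] => next state.
  TRANSITIONS = {
    [:healthy, :up]     => :healthy,
    [:healthy, :down]   => :uncertain,
    [:uncertain, :up]   => :healthy,
    [:uncertain, :down] => :failed,
    [:failed, :up]      => :healthy,
    [:failed, :down]    => :needs_human,
  }.freeze

  attr_reader :state, :observations, :actions

  def initialize(probe)
    @probe = probe     # callable returning :up or :down
    @state = :healthy
    @observations = [] # append-only observation log
    @actions = []
  end

  def tick
    observation = @probe.call                                       # feel
    @observations << observation
    @state = TRANSITIONS.fetch([@state, observation], :needs_human) # think
    act                                                             # do
    @state
  end

  private

  def act
    case @state
    when :failed      then @actions << :restart      # codified playbook step
    when :needs_human then @actions << :page_a_human # "I need an adult"
    end
  end
end

readings = [:up, :down, :down, :down].each
monitor = ResourceMonitor.new(-> { readings.next })
4.times { monitor.tick }
p monitor.state   # => :needs_human
p monitor.actions # => [:restart, :page_a_human]
```

Since all state lives in the monitored record and the observation log, the worker that calls `tick` can stay stateless and simply requeue the resource afterwards.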

Understanding others

  • Leveling up people skills
    • Everyone is different
  • Developers
    • Grinders - product/feature-obsessed. Iterative. “Move fast, break things”
      • Explain how process X is better to communicate
      • Use TDD
    • Tour Guides - Works across the system/stack. Helps others
      • Ask how is something the way it is.
      • 50/50 split between legacy and sharing knowledge. write it down and spread the knowledge
    • Geniuses - Thinkers, worriers. Focused on quality and experimentation
      • Ask how can we iterate towards it being better
      • Make sure they don’t want to completely throw out the current system
    • T-shaped people - Focused on one area of expertise.
      • they might be so-so at everything else except that one thing.
      • share that particular knowledge. teach their weaknesses
    • Fun leader
      • People skills with some tech knowledge
      • They can’t fix every people problem. let them cool down.
  • Learn about yourself and know your own tendencies
    • Know what you need to recharge yourself. Know how to humor people
    • Know when your personality type is in conflict with the other
  • Empathy - Understand the other and their current situation and next move
    • Can be difficult online.
    • Why would someone say that?
      • Hanlon’s razor: assume stupidity over malice
      • Occam’s razor: prefer the simple explanation over the complex one
  • There is another person on the other end of the text box
  • “An adult is someone who has been around longer than one hype cycle”
  • “Upstart is 1-3 years” and wants to rewrite the world