Engineering Problem, or Tool Blinders?

Tony and I met with a former co-worker early this year, to talk about his use of MongoDB.  For some reason, they thought Tony was an authority on this.  I just came along for the lunch.

The Question

The gist of the discussion was this:  how best to store large binary objects in the database?  Was Mongo the best choice, or would PostgreSQL be better?  They were using both, but some devs were making a concerted effort to get over to MongoDB for new work.

After about 15 minutes of (kinda non-) discussion, I butted in: “I think you’re asking the wrong question.  Do those files even NEED to be in the database?”

Metadata about the files?  Absolutely.  But the files themselves, probably not.

The Actual Problem

Databases are good at a few things:

  • Finding things quickly (indexing, hashing, etc.)
  • Keeping things straight between many competing threads (ACID stuff)

You know what they’re usually bad at?  Sending or receiving large field values (I mean > 512KB or so), especially within a transaction.  Especially under heavy concurrent load.

I can hear the DBAs out there yelling at the page that of course it can be done.  I didn’t say it was impossible, it’s just soooo not what databases are good at.  While various database platform vendors have tried mightily over the years, storing large objects in your database is still possibly the worst physical database design decision you can make:

  • Worse than adding all of your query fields to the PK.
  • Even worse than getting involved in a land war in Asia.
  • Worse than, (gasp), stored procedures?  Well, that might be a closer race, but let’s leave stored procedures for another day.

The GForge Approach (is nothing special)

I’ve always thought that GForge handled files in the clunkiest, lowest-tech way possible – on the file system.  It still does, and even the Big New Version will continue to do so.  But it wasn’t until this discussion that I actually became an advocate for this approach.  Why?  Because it’s only clunky until you compare it to all of the other approaches.

GForge stores all of the metadata (date, name, size, author, project, etc.) in the database; we even add some keyword indexing with PostgreSQL’s excellent tsearch facility.  But when it’s time to upload or download one of these files, we use FTP or HTTP.  We get away from the DB when it’s time to do something that might run longer than a second, and we play to the design strengths of those other protocols.  And while nearly all of our customers serve web and file traffic from one server, GForge does allow configuration of different hosts to offload those tasks.
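To make the split concrete, here is a minimal sketch in Python.  It is not GForge’s code or schema: SQLite stands in for PostgreSQL, and the table, column and function names (file_meta, store_file, load_file) are all made up for illustration.  The bytes go to the file system, and only a short, indexable row goes to the database.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_catalog(db_path=":memory:"):
    """Create the (hypothetical) metadata table; only small fields live here."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_meta (
               id INTEGER PRIMARY KEY,
               name TEXT NOT NULL,
               author TEXT NOT NULL,
               size INTEGER NOT NULL,
               sha256 TEXT NOT NULL,
               path TEXT NOT NULL
           )"""
    )
    return conn


def store_file(conn, storage_dir, name, author, data):
    """Write the bytes to disk first, then record a short metadata row."""
    digest = hashlib.sha256(data).hexdigest()
    target = Path(storage_dir) / digest  # content hash used as the file name
    target.write_bytes(data)
    cur = conn.execute(
        "INSERT INTO file_meta (name, author, size, sha256, path) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, author, len(data), digest, str(target)),
    )
    conn.commit()
    return cur.lastrowid


def load_file(conn, file_id):
    """Find the path in the database, then read the bytes from disk."""
    row = conn.execute(
        "SELECT name, path FROM file_meta WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    name, path = row
    return name, Path(path).read_bytes()


if __name__ == "__main__":
    catalog = open_catalog()
    with tempfile.TemporaryDirectory() as storage:
        new_id = store_file(catalog, storage, "build.log", "m", b"lots of bytes")
        print(load_file(catalog, new_id)[0])
```

The database transaction only ever touches a few hundred bytes, no matter how big the file is.  In a real setup the read in load_file would be handled by the web or FTP server rather than application code.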

This shouldn’t come as a surprise to anyone.  It didn’t really come as a surprise to our friends who asked the original question.  But I could tell they didn’t really want to hear that the best solution didn’t involve new, shiny stuff.  I understood that, and I still do.  The best coders are the ones who do it even when no one’s paying them, just to try things out and learn.  You don’t want to squash that initiative, but it can’t be your main criterion for choosing tools to solve business problems.

Have fun out there, just don’t let the shiny get in the way of the job.

 

Epilogue

I heard from my former co-worker a couple of weeks ago.  They went with S3 storage, indexed metadata in a (PostgreSQL) database, and links between the two.  They managed to find a solution that was both powerful and shiny.  Kudos to them!

Real-World Scenarios, Episode 1: Changecause

The folks at Changecause were good enough to publish a blog post about their efforts to make bug reporting easier for people outside their team.  It was a clever solution for gathering issues from end-users, but there are also a few trade-offs at work.  GForge satisfies the same requirements in a much more elegant (and supportable) way.

Third-Party Integration, Squared

The first anti-pattern is integrating two third-party tools with each other.  Yes, it’s neat, and it’s fun, and I’ve done it, too.  Heck, GitHub has dozens of third-party integrations – so cool.  But what happens when one endpoint changes its behavior, its API signature, or just goes away?  Who do you get help from?

[Image: kid pointing both ways]  Hey kid, where’s the trouble?

It’s trivially easy to get caught by a problem like this.  In fact, I’d say it’s inevitable.  And it’s outside your control.  At my last job, this happened to us three times in about six months, with some A-list players.

You may be paying (probably too much!) for your task-management tool, in which case at least you will have a defined service level – that is, someone you’re paying to help you out when things don’t work.  But most small and medium-size software shops rely primarily on free tools, which usually means you’re on your own.  Even if you have an SLA with both (or all) involved vendors, it is extremely likely that they won’t agree on the source of the problem, or its solution.

The Core Competency Question

One of the reasons that these ad-hoc integrations happen in the first place is that it’s software, which is probably your personal core competency anyway.  You spend an hour building something, and it works.  You get a good amount of value out of a minimal amount of your time, and you exercise some control over your otherwise frantic and unpredictable startup experience.

Except that this integration is not your company’s core competency.  Neither is bug tracking, or version control, or DBMS, or any of the other foundational tools that you use to build, e.g., Changecause.  So that hour you spent may have saved some other hours of distraction, handling complaint emails, but it didn’t add a new feature to your actual product.  And, over the next couple of weeks you’ll spend another eight hours tinkering with the integration to add a field, to handle an API change, or to update the API key again.  At that point, you may still be breaking even but it’s clearly not a big win.

Edit: While waiting for my other GForgers to give me their feedback, I happened across this pretty relevant blog post.  I’ve bookmarked it for yet another blog post in the future.

Okay, Smarty Pants

…how would you do it with GForge, then?  I thought you’d never ask.

I would build that same bug submission form in your website, instead of embedding the Google Doc form.  Gather and validate the data using your existing web app framework, like you’re doing for the rest of your app (instead of a different technology, with a different set of quirks and bugs).  Then I’d pack it all up on your back-end server, and send an email to your GForge project.

GForge has really good integration with email.  You can create a bug/ticket/suggestion or whatever you want via email, by sending to the right email address.  By default, it’s [projectname]-[trackername]@[gforgehost], e.g., gforge-support@gforge.com.  You can even customize the email address, e.g. support@gforge.com, which is what we do for customer support.  Customers can just send us an email to start a support request, and the GForge Support Tracker captures the entire conversation, including attachments (like screen shots, logs, etc.).
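Here is a rough sketch of what “pack it all up and send an email” could look like on your back end, using Python’s standard email library.  The only thing taken from GForge is the default address pattern above; the function names are hypothetical, and how GForge maps the subject and body onto Tracker fields is not something this sketch assumes.

```python
from email.message import EmailMessage


def tracker_address(project, tracker, host):
    """Default pattern from the post: [projectname]-[trackername]@[gforgehost]."""
    return f"{project}-{tracker}@{host}"


def build_bug_email(sender, project, tracker, host, summary, details,
                    attachment=None):
    """Turn validated form data into a message addressed to a Tracker.

    attachment, if given, is a (filename, bytes) pair such as a screen shot.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = tracker_address(project, tracker, host)
    msg["Subject"] = summary
    msg.set_content(details)
    if attachment is not None:
        filename, data = attachment
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return msg
```

Handing the result to your mail server is one more call, e.g. smtplib.SMTP(your_mail_host).send_message(msg), where your_mail_host is whatever relay your app already uses.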

It’s still a minor diversion from your core competency.  But at least it’s a direct connection between your own technology (which you’re responsible for, anyway) and GForge, which we support every day, for some of the biggest companies in the world.  If you want to tweak the form, ask another question (or allow a screen shot), go for it – GForge will still capture everything you send in the e-mail, just the way you sent it.

If you’d like to try it out for yourself, start a free project at gforge.com, or visit gforgegroup.com to download the installer and run it on your own server.  If you’re trying it out and have questions or comments, let us know!

Thanks,

M.

PS – I also enjoyed another blog posting by Changecause, this one about their internal planning/task workflow.  It’s somewhat similar to where we’re going internally, and has inspired me to build a GForge template.  I’ll post an update about it sometime soon.

Raytheon: Not Enough New Cyber-Security Talent

GForge customer Raytheon just posted this article in their newsroom about the lack of cyber-security workers, not only in the immediate pipeline, but all the way up the chain – our high school kids aren’t even being told it’s a career.

Given the security hoops we jump through for them, I’m inclined to believe them.

Read the full article here: http://raytheon.mediaroom.com/index.php?s=43&item=2435

GForge Live Discussion (aka Chat)

One of the big new features in 6.2.1 was the Live view on our Discussions plugin.  It’s basically a chat room, about a project, a document, or just about any other object on your GForge site.  All of the conversations are automatically saved as Discussion Threads for later viewing, and are searchable along with everything else in your project.  They’re also access-controlled, so you can allow the right people in on your sensitive discussions.

The best part?  No installs, no widgets, no special ports to open or configure.  It’s all regular web traffic in a regular browser window.

We use Chat all day long at GForge – it’s a huge productivity tool for folks that can’t (or don’t want to) yell over a cube wall.

If you’re not already using Chat, you should definitely check it out.  Here are three short videos by our own Olivia, detailing three great features that make our Chat one of a kind.

 

Project Activity Feed

From any Chat tab, each user can choose to see project-related activity as it happens.  This is great for keeping up with what’s going on, without having to ask anyone what they’re doing.

 

Auto-Link, Auto-Preview

When you post through the Chat window (or via email, or directly in the Discussions web page), GForge automatically picks up on what you’ve entered.  We’ll pre-render graphics, embed the YouTube player, show a nice preview block for other URLs, and even provide links to other GForge objects that you mention by ID.  And when you mention something in GForge, we’ll also add a note to that item, tying back to the Discussion where it was discussed.

 

Emoticons, Sounds, Images

Aside from the very real productivity and team benefits, it’s also fun to make a little noise once in a while.  GForge has a huge set of emoticons that you can click on or type in to let others know what you think.  There’s also a sounds button, with an expandable set of sound clips you can play for everyone.

GForge in One Minute

So, Intern Olivia Treu recently headed back to school…but she left behind a whole raft of how-to screencasts about GForge features.  So many, in fact, that we had her create a YouTube channel to keep them all organized.

Since we’re rapidly wrapping up the 6.3 release, I’m going to highlight a new screencast or two every week for the next few weeks, starting with the original GForge In One Minute:

Enjoy!

What To Expect in GForge 6.3

Summer has really flown by. We’ve been so busy adding features and fixes for GForge 6.3, I didn’t even get to the State Fair this year. As we begin winding down, I thought I’d take a few minutes and share the current list of what’s done now and what’s left to do for our release in October.

Before I get into the features and improvements, I want to remind everyone about our Customer Repositories. As you’re reading about all the great things in 6.3, please remember that you can have them right now, by cloning our customer repository and installing the current version of 6.3 in the customer-next branch. It’s the same code we run every day on gforge.com. Test it out, tell your friends, and let us know your feedback.

Now, on to the changes!

What’s New (right now)

There are literally hundreds of individual changes (link), and that level of detail has never made sense to me as a customer. So for this posting, I’m going to talk less about the exact technical changes and more about how the features help you and your team.  Even so, there’s waaaaay too much to wade through.  So, I’ll make it easier by listing the topics below.

Tracker

  • Tracker Email Integration
  • Tracker Extra Fields
  • Tracker Browse
  • Tracker Item
  • Support Tracker
  • Commits, Tracker Workflow and Time Tracking

Project

  • Burn-Down Enhancements
  • Project Ratings and Reviews
  • Project Invites
  • Project Nav Item Counts
  • New “Group Member” Access Control

Site/Admin

  • Trove is Back
  • Jenkins Integration
  • Security Enhancements
  • Site Administration

Tracker

Tracker Email Integration

Lots of people (ourselves included) have customers who don’t have the time or inclination to visit a GForge site.  Since everyone already uses email, it’s an obvious choice for bridging the gap.

  • Want to let your business customers, testers, or users create Tracker Items just by sending an email?  Now you can.
  • Of course, it would be really great if they got a response back, so they do.  Want to customize what it says?  Check.
  • Do they want to attach a screen shot showing the problem, a spreadsheet with the right numbers, or the log file you need?  We’ve got that covered, too.
  • Wouldn’t it be nice if folks could add a follow-up to the TI by replying to the email?  We thought so, too.
  • Want to use email on any TI?  No problem – now, every Tracker Item has its own email address, too.

Tracker Extra Fields

So many neat things to play with:

  • Don’t need Start Date, Duration or the other “standard” fields for your Tracker?  Now you can turn them off.  In fact, you can turn off any of the default fields for a Tracker.
  • We’ve moved the Tracker Admin options from a separate “admin” page to the regular list of Trackers in each project.  Fewer clicks!
  • You can now assign a short (alias) name to each value in a select or multi-select type Extra Field.   When used for Status, these aliases can be part of a commit message to trigger workflow on Tracker Items (See Commits, Tracker Workflow and Time Tracking, below).
  • You can now change the Open/Closed category for a Status element without having to delete and re-create the element.
  • New Extra Field: Member List.  This field can show users, site groups or both, and is used to drive “Member Of” access control (see “Group Member” Access Control, below).

Tracker Browse

  • Quick Browse & File Export – We’ve cleaned up the Quick Browse and Query area so it’s less cluttered and clearer to use.  Also, you can get a one-click CSV, XLS or XML download of the exact list you’re looking at (even for multiple pages), whether it’s a Quick Browse or a saved Tracker Query.
  • Grid Updates – You can choose which standard fields (like start/end date) show up in the grid display, just like Tracker Extra Fields.  And we’ve removed the “Delete” link from each row because, really, who uses that?

Tracker Item

  • Submitter Field – We’ve made the Submitter field editable.  If you enter a Tracker Item on behalf of someone else, it can still belong to them.
  • Edit/Delete Follow-Ups – How often does the discussion around a tough problem change course on you?  Some of the notes that are left after it’s solved can be downright misleading.  So now, you can fix the wording on some follow-ups, and delete the ones you don’t need anymore.  These actions are part of the Tracker Item’s change history, so your audit trail is still intact.
  • Smarter Activity Logging – We’ve centralized storage of significant events across all Project plugins (e.g., Tracker, Docman, Wiki, Discussions, etc.), so that the Activity feed in Chat, the Activity view on your Project Home Page, and Activity reporting are all identical, all the time.
  • Assign to “Nobody” – Automatic assignment as part of workflow is pretty cool.  But we ran into some scenarios where we actually wanted to remove any assignees, such as when the Tracker Item is closed.  So now, in the Workflow Transition edit page, you can specify a person to be assigned, or Nobody.

Support Tracker

This one is pretty big – Tony has already written an entire post about it.  We’ve been using an outdated, open-source system for tracking customers, license keys, support tickets and knowledge base articles for a long time.  Like, multiple Presidents ago.  Many of the features listed here were things we needed to bring all of that support directly into GForge.  Starting soon, you’ll be able to open and reply to support tickets via email, get your license keys, and see all of the support incidents for your organization through gforgegroup.com and gforge.com.

What’s really great is that everyone can take advantage of the same features to provide support for GForge, or for your GForge-based projects just like we do.  If you provide support of any kind, to internal or external customers, you should take a good look at Support Tracker.  Or, just ask us for a tour.

Commits, Tracker Workflow and Time Tracking

GForge already lets you tie files changed in your SCM (CVS, SVN, Git and others) to the Tracker Item you’re working on.  That kind of traceability is huge for code reviews, change management and quality control.  But I’ve always been a little envious of GitHub’s ability to close issues with a commit.  For developers, it means you commit your code, and don’t have to go back and update a web page, which is a real time-saver.  It’s such a great feature, with one little problem:  in what world are developers allowed to close bugs?  Plus, as much as agile is wonderful and fun, it is still a great idea to know how much time something really took – another area where GitHub’s approach really falls short.

So we’re going to go GitHub one better – well, THREE better, if you’re counting:

  1. Branch Name -> Tracker Item – Putting the Tracker Item ID in the commit message is cool.  For the first two or three messages.  But it gets boring pretty quickly.  How about putting #12345 in the branch name instead, and we’ll associate all of the commits to that Tracker Item.  More than one TI (like, fixit_#12345_#23456)?  Not a problem.  Have #12345 in the branch name, and need to point one commit to #23456?  Do it, we’ll handle it for you.
  2. Commit Message -> Workflow – Usually there’s a step or two after you’ve done the work – testing, customer evaluation, staging for release, all that.  If you define the workflow for a Tracker, you can use the status aliases in your commit message to set the next status for a Tracker Item.  When you push that commit to the GForge repository, it will automatically move the Tracker Item along.  For example, add [#12345,test] to the message, and we’ll set #12345 to status “Ready For Test”.
  3. Time Reporting – No one likes reporting time, because you always do it after the work is done.  Like the next day, or a week later.  If you could report time to a Tracker Item right when you did it, not only would it be brain-dead simple, could it actually become….accurate??  Try this in your commit message: [#12345,test,1.5] to tie your commit to #12345, AND move it to “Ready For Test” status, AND log an hour and a half of working time to it.
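For the curious, the three conventions above can be sketched with a couple of regular expressions.  This is an illustration only, not the hook GForge actually runs: the patterns and names are my own, and the rule that an ID in the commit message overrides the branch name is an assumption about how “we’ll handle it for you” might work.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# [#12345], [#12345,test] or [#12345,test,1.5]
TAG_RE = re.compile(
    r"\[#(\d+)(?:,\s*([A-Za-z_]\w*)(?:,\s*(\d+(?:\.\d+)?))?)?\]"
)
# Any #12345 inside a branch name such as fixit_#12345_#23456
BRANCH_ID_RE = re.compile(r"#(\d+)")


@dataclass
class CommitTag:
    item_id: int
    status_alias: Optional[str] = None  # e.g. "test" for "Ready For Test"
    hours: Optional[float] = None       # time to log against the item


def parse_commit_tags(message: str) -> List[CommitTag]:
    """Extract every bracketed tag from a commit message."""
    tags = []
    for item_id, alias, hours in TAG_RE.findall(message):
        tags.append(
            CommitTag(int(item_id), alias or None,
                      float(hours) if hours else None)
        )
    return tags


def branch_item_ids(branch: str) -> List[int]:
    """Collect Tracker Item IDs mentioned in the branch name."""
    return [int(n) for n in BRANCH_ID_RE.findall(branch)]


def items_for_commit(branch: str, message: str) -> List[int]:
    """Assumed rule: IDs tagged in the message win over the branch name."""
    tagged = [tag.item_id for tag in parse_commit_tags(message)]
    return tagged or branch_item_ids(branch)
```

So a commit on branch fixit_#12345 with the message “typo fix [#23456]” would be tied to #23456 only, while an untagged commit on the same branch falls back to #12345.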

Project

Burn-Down Enhancements

In 6.2.1, we added Burn-down charts for open/closed Tracker Items associated to each Release.  And that was pretty cool.  For 6.3, we’ve improved the graphing algorithm to give more accurate predictions on velocity and completion date.  We also made it possible to add burn-down charts to your Project Home Page, by adding the %%BURNDOWN%% keyword.  You can add as many as you like, and put HTML DIVs around them to customize the layout and size.

What’s more, you can also choose any Tracker Query for the data source.  This Week’s Work, Jeremy’s Bugs, whatever you want, on the front page of the project.
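If you’ve never looked at how a burn-down prediction works, here is the simplest possible version.  To be clear, this is not the algorithm in 6.3 (this post doesn’t describe it); it is a naive average-velocity projection, with made-up names, just to show what “velocity” and “completion date” mean in this context.

```python
import math
from datetime import date, timedelta


def project_completion(samples):
    """Naive burn-down projection.

    samples is a list of (date, open_item_count) pairs, oldest first.
    Returns (velocity in items closed per day, projected completion date),
    or None when there is not enough data or no net progress.
    """
    if len(samples) < 2:
        return None
    first_day, first_open = samples[0]
    last_day, last_open = samples[-1]
    elapsed_days = (last_day - first_day).days
    if elapsed_days <= 0:
        return None
    velocity = (first_open - last_open) / elapsed_days
    if velocity <= 0:
        return None  # the open count is flat or growing
    days_left = math.ceil(last_open / velocity)
    return velocity, last_day + timedelta(days=days_left)


if __name__ == "__main__":
    history = [
        (date(2013, 9, 2), 40),
        (date(2013, 9, 9), 33),
        (date(2013, 9, 16), 26),
    ]
    print(project_completion(history))
```

A real chart would weight recent days more heavily and account for items added mid-release, which is exactly where a straight average goes wrong.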

Project Ratings and Reviews

Now your users can leave 5-star ratings on projects, and also add short text reviews.  Ratings and reviews are viewable as part of the project information, to anyone with read permissions.  The average rating can also be part of search results when looking for Projects.

Project Invites

A simple but powerful feature that we’ve been asked for several times – Project Admins can now invite users to join a project.  Enter a list of email addresses, and GForge will send the invitation with a link.  You can invite existing GForge users and new ones in the same batch.  You can even pick the Project role for those users when they join.  One step, done.

Project Nav Item Counts

Different projects use GForge differently – some need lots of mailing lists, some have lots of News/Blog postings, and so on.  To make it easier to see where the good stuff is on each project, we’ve added item counts to primary elements in the Project-level menu (left-side nav).  You’ll be able to see how many Releases, how many root-level Documents and Folders, Blog Postings, and other things you’ve got, without having to click through to each area of the project.

New “Group Member” Access Control

Remember that new “Member List” Extra Field I mentioned before?  Member List fields can be used to drive access control to the Tracker and Tracker Items based on the value.  For example, you can put users from Customer A in one site group, and users from Customer B in another site group.  Then, using the Member List field (let’s label it “Customer” for kicks), you can ensure that only users in Customer A see Tracker Items with that group selected in the “Customer” field.

Member List fields automatically show up in Project Admin’s Role Edit page for Trackers where they are added.
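In code terms, the Customer A / Customer B example boils down to a filter like the one below.  It is a sketch under my own assumptions (items as plain dictionaries, a field literally named “Customer”, and items with no value hidden), not the way GForge stores or checks anything.

```python
def visible_items(items, user_groups, field="Customer"):
    """Keep only items whose Member List field names one of the user's groups.

    Assumption for this sketch: an item with no value in the field is hidden.
    """
    return [item for item in items if item.get(field) in user_groups]


if __name__ == "__main__":
    tracker_items = [
        {"id": 101, "summary": "Report is slow", "Customer": "Customer A"},
        {"id": 102, "summary": "Wrong totals", "Customer": "Customer B"},
        {"id": 103, "summary": "Internal cleanup"},
    ]
    for item in visible_items(tracker_items, {"Customer A"}):
        print(item["id"], item["summary"])
```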

Site/Admin

Trove is Back

Yes, we’ve brought back the Trove system for categorizing projects across many different subjects. Site administrators can create whatever categories your enterprise might need, for technology, project phase, geographical area, business unit, compliance – anything you’d like. Project admins can select the right values for their projects, and update them over time.

In 6.3, we’ve made several improvements to Trove to make it more useful for cross-cutting concerns like PMO/budget/performance management, architecture and regulatory compliance, licensing and component re-use, etc.

  • Searching for Projects now includes Trove categories in the search. Use the category names and find projects with the matching Trove entry.
  • Search results can include selected Trove categories in the result table. Site admins can define the specific categories to be displayed. So now you can have Project Phase, Business Unit, or whatever other important information shown right away.
  • Site admins can now set access controls on Trove categories, for information that needs to be managed outside of the project.  We have a customer using this feature for rating the re-use potential and readiness of their individual software projects.  Other orgs may choose to rate project performance, architectural, process or regulatory compliance, or even project success/failure (if there is such a thing!) – anything that might need to be evaluated by people outside the project itself.

Jenkins Integration

We’ve updated our Jenkins plugin code to the newest versions of the Jenkins API, and improved the information we send back to GForge about builds.  You’ll now see activity records each time the build starts and ends, as well as each build result.  Turn on the Activity feed in your Project Chat and see builds happen in real time!

Security Enhancements

  • Better protection against frame-based click-jacking
  • Removal of server info from the HTTP response
  • Protection against bot-based account harvesting: site admins can choose how much user profile info to expose to anonymous users and registered users.
  • When updating their SSH key (for SCM access), users will now be required to enter their password as part of the update.
  • Site admins can set accounts to lock after n bad attempts.  Use 0 to disable this new feature; the default is 5.  Admins can re-enable accounts through the GForge Site Admin pages.
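The lockout rule in the last bullet is easy to picture as a small counter.  The class below is a hypothetical sketch, not GForge’s implementation; the default of 5 and the meaning of 0 come from the bullet above, while resetting the count after a successful login is my own assumption.

```python
class LoginGuard:
    """Lock an account after max_attempts failed logins (0 disables locking)."""

    def __init__(self, max_attempts=5):
        self.max_attempts = max_attempts
        self.failures = {}   # username -> consecutive failed attempts
        self.locked = set()  # usernames currently locked

    def record_failure(self, user):
        if self.max_attempts == 0:
            return  # feature switched off
        self.failures[user] = self.failures.get(user, 0) + 1
        if self.failures[user] >= self.max_attempts:
            self.locked.add(user)

    def record_success(self, user):
        # Assumption: a good login clears the failure count.
        self.failures.pop(user, None)

    def is_locked(self, user):
        return user in self.locked

    def admin_unlock(self, user):
        """What a site admin would do to re-enable the account."""
        self.locked.discard(user)
        self.failures.pop(user, None)
```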

Site Administration

  • Site Admins can now find orphaned and dead projects with some new search criteria.  Click on the “Projects” top-nav tab to see the criteria and search.  Then select a set of projects and mark them all for removal with one click.  Don’t worry, we’ll still ask if you’re sure first.
  • Did your SVN repository get out of sync with GForge commit history/messages?  We’ve got a script for that.  Run the re-parse script to go back through SVN commits and rebuild the corresponding data in GForge.

What’s On The Way (6.3 final)

Well, we’re not done yet.  Here’s what we’re going to finish in the next few weeks.

  • Search Stats – We’ll keep track of popular keywords, and show the top searches to help users find what they’re looking for, and help site admins understand what’s hot.
  • E-Mail Project Members – Project Admins will be able to send email to all members of a project.
  • Generated robots.txt – GForge will automatically offer up a robots.txt, to make sure that search engines don’t choke your server.
  • Executable cronjob scripts – A minor enhancement to make manually running the occasional cron script a bit easier.
  • Tracker Browse w/GET – By putting all of the filter params into the URL, you’ll be able to easily copy and paste exact searches to share or bookmark.
  • Tracker Query on Tags – Users will be able to filter Tracker Items by tag values.
  • User Stories Tracker Template – A first-class implementation of Use Cases/User Stories that you’ll be able to clone into your own project.  All set up and ready to, uh, agile.
  • Centralized “@”-mentions – Being mentioned in live chat, discussions, Tracker Items or even commit messages will go to your Message Wall, your Chat window, the Notifications list, or the Growl-style notifier, depending on how you’re available at the moment.
  • SCM Access Control from GForge Database – Instead of generating files for Git access control, a hook script will query GForge users, roles and project memberships in real time.  No more giant ACL files for large installations, and no more running the ACL cron manually to repair them.
  • CKEditor Update – The latest version of the greatest embedded WYSIWYG editor.  Some great new features and sweet visual design (See more at http://ckeditor.com/about/features)
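As a quick illustration of the “Tracker Browse w/GET” item: once every filter lives in the query string, a search becomes a plain URL that round-trips cleanly.  The base URL and parameter names here are invented for the example; the real ones will be whatever ships in 6.3.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def browse_url(base, filters):
    """Encode filter values into the query string (sorted for stable links)."""
    query = urlencode(sorted(filters.items()), doseq=True)
    return f"{base}?{query}"


def filters_from_url(url):
    """Recover the filters from a shared or bookmarked link."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    link = browse_url(
        "https://example.com/tracker/browse",
        {"status": "Open", "assignee": "jeremy"},
    )
    print(link)
    print(filters_from_url(link))
```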

Okay, that was a long list of stuff.  I hope you found some interesting things to check out, and some really good reasons to update.  But for many of you, there is one more reason to consider…

End of Support for 5.x

With all of these new features, and the roadmap we’re planning for 2014, something’s gotta give.  We’ve been making noise all year about the end of support for GForge 5.x, but this time we mean it.  With the release of GForge 6.3 in mid-October, we will no longer provide direct support for 5.x installs.  If at all possible, you should make plans to upgrade, like, now.  Go get the 6.3 snapshot from the Customer repository, fire up a VM copy of your GForge instance and go through the upgrade process.  Let us know what questions, problems and tricks you come across, because we want the upgrade to go smoothly when it’s time.

Traits are Includes!

As part of the Great New Code To Come (some call it GForge 6.3), we’ve been looking at several improvements to our technology stack, not the least of which will be moving to PHP 5.4 or 5.5. Traits are an interesting new construct, neither Interface nor Base Class, that I’ve been trying to factor into the Great New Code. I’ll admit that I’ve been kinda stumped about why traits are cool. Well, that’s because it turns out they’re probably not.

Benjamin Eberlei got things started with a good, objective analysis of the trait feature. Boiled down, his argument is that if you can derive the same benefits of a trait using a class and static method instead, then traits are really the same tool with a different handle. And I agree, but I think it’s even worse.  If Benjamin’s associative analysis is valid, then traits are actually just include/require statements with the trait {} wrapper around the functions.

Consider this example from the article:

[Screenshot: Selection_037, code example using a trait]

Here’s Benjamin’s equivalent, rewritten for static methods:

[Screenshot: Selection_038, the same example using static methods]

And here’s the same thing as an include, without the trait/class wrapper:

[Screenshot: Untitled Document 1 - gedit_039, the same example using an include]

Yes, there’s extra plumbing in there, because PHP does not allow random includes in the middle of a class declaration.  And while it’s clunky to look at, it serves exactly the same purpose as a trait – shared implementation without inheritance (or aggregation, which IMO would be the right solution here).  Other than being clunky, it seems no less reusable, testable or readable than traits (but again, not as good as aggregation).

So: IF you hate this code, AND it’s functionally equivalent to traits, THEN maybe traits aren’t so great either.
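To show the shape of the argument in runnable form, here is a small comparison in Python.  It is not the code from Benjamin’s article, and it is only an analogy: Python has no traits, so a mixin class stands in for one (and, unlike a PHP trait, a mixin is technically multiple inheritance).  All class names are invented.

```python
class GreetingMixin:
    """Stand-in for a trait: behavior copied into any class that mixes it in."""

    def greet(self):
        return f"Hello, {self.name}"


class UserWithMixin(GreetingMixin):
    def __init__(self, name):
        self.name = name


class Greetings:
    """The static-method version: same shared code, called explicitly."""

    @staticmethod
    def greet(name):
        return f"Hello, {name}"


class UserWithStatic:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return Greetings.greet(self.name)


class Greeter:
    """Aggregation: the collaborator is an object you can swap or configure."""

    def __init__(self, salutation="Hello"):
        self.salutation = salutation

    def greet(self, name):
        return f"{self.salutation}, {name}"


class UserWithGreeter:
    def __init__(self, name, greeter):
        self.name = name
        self.greeter = greeter

    def greet(self):
        return self.greeter.greet(self.name)
```

The first two versions hard-wire the behavior at class definition time.  Only the third lets a test (or a caller) pass in a different Greeter, which is why aggregation looks like the better tool for this particular job.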

Maybe I’ve missed the point, and I would love to hear about that.  I don’t think I have.  In fact, I think this is a good example of how sneaky and pernicious the “new” can really be.  How easily we can be dazzled by something slick that does the same job as the old tool we already have.  But relentlessly adding new things leads to cruft, to redundant and fragile features.  For example, I spent a couple of hours last week removing all of the ereg() calls from our code base, in favor of preg_match().  Over time, too much “new” can de-stabilize a solid product, as we have seen in PHP (and others, of course).

Instead of adding new (and arguably redundant) language features, maybe we should stay focused on harnessing what’s already there.  A great example for PHP would be the array_column function coming in 5.5, that lets us easily process the arrays that are central to PHP’s power.  This is code I’ve written before and will be happy to retire.  And it’s not a new feature, but a deeper usage of PHP’s powerful array handling.

Dynamic languages like PHP and JavaScript have been incredibly popular for many years, but much of their usage has (unfortunately) conformed to Sturgeon’s Law.  With better understanding and application of the basic language features, we now have tools like Composer, Doctrine, and KnockoutJS – tools that largely insulate us from the Fractals of Bad Design.  I am more excited about PHP than I’ve been in several years, because these tools also drive me to better understand and apply those same language features.  I’m looking forward to Great New Code.