Just A Summary

Piers Cawley Practices Punditry

Domain Specific Pidgin

So, I’m busily writing an article about implementing an embedded little language in Ruby. It’s not something that’s going to need an entirely new parser; it borrows Ruby’s grammar and syntax but does some pretty language-like things to the semantics, and ends up feeling far more like a declarative language than the usual Ruby imperative OO style.

Because I tend towards chromatic’s view of many Ruby programmers’ ability to cry “Wolf!” (or rather “DSL!”), I don’t want to claim that it’s a full-blown Domain Specific Language, but it’s sufficiently language-like that ‘API’ doesn’t seem to fit as a description either.

Then it hit me… it’s a pidgin.

A pidgin can be thought of as a mashup of two languages, taking vocabulary from both its parents and its grammar (usually simplified) from one parent. Historically, pidgins have arisen to help with trade and colonization; grammars have tended to be lifted and simplified from the ‘native’ language and then spiked with words from the colonizing language with a leavening of native words where they make sense. All quite politically incorrect nowadays, but they served their purpose. Pidgins are, by their nature, domain specific; fine if you wanted to talk trade or order your coolies about, but not what you’d write poetry in. Poetry tended to get written in the creoles that evolved from some pidgins. A creole is a general purpose language, with a grammar of its own; they seem to evolve from pidgin, getting invented by the kids of parents who speak that pidgin.

In my little language, much of the vocabulary is pinched from my problem domain’s language, and the grammar and some terminology are lifted from Ruby. Casting the problem domain as the colonial power and Ruby as the native language, it’s obvious that I’ve invented a pidgin language.

Let’s embrace that term. We’re not writing domain specific languages, we’re writing pidgins. ActiveRecord’s family of class methods isn’t a DSL, it’s database pidgin. RSpec is a testing pidgin, Parsec is a parsing pidgin, so are any number of APIs that make their host language feel like a new one.
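
To make “class methods that read like declarations” concrete, here is a minimal sketch in plain Ruby. It is not ActiveRecord’s code or API; `Declarative`, `field` and `Person` are hypothetical names invented purely for illustration. The grammar is all Ruby, while the vocabulary comes from the problem domain.

```ruby
# Hypothetical illustration only: Declarative, field and Person are made-up
# names, not part of ActiveRecord or any other library.
module Declarative
  # Registry of declared fields, stored per class.
  def fields
    @fields ||= {}
  end

  # Reads like a declaration, but is just a class method that takes a name
  # and a hash of options with meaningful keys.
  def field(name, options = {})
    fields[name] = options
    attr_accessor name
  end
end

class Person
  extend Declarative

  field :name, type: :string
  field :age,  type: :integer, default: 0
end
```

Nothing here needs a parser of its own: `field :age, type: :integer, default: 0` is an ordinary method call, which is why it is arguable whether it deserves to be called a language at all.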


In the comments, Aristotle Pagaltzis points out that Parse::RecDescent isn’t a parsing pidgin because it uses a full blown parser (itself, obviously) to parse any grammar declarations.

Published on Wed, 08 Aug 2007 02:43:00 GMT by Piers Cawley.

    By Ben Wed, 08 Aug 2007 11:48:07 GMT

    Interesting – I’ve been calling these (AR, RSpec, etc.) domain specific dialects for a while now. I think dialect might be a better term, given that the main distinguishing characteristic is the vocabulary, not the grammar (pidgins, as I recall, generally don’t take the grammar of one parent – they usually follow an extremely simplified grammar that is remarkably consistent regardless of the parent languages (which is often used as an argument for Universal Grammar, fwiw)).

    By Piers Cawley Wed, 08 Aug 2007 12:25:42 GMT

    We’re fast approaching “How many angels…?” territory here, but I’d be inclined to call ActiveRecord’s family of class methods taking hash arguments with meaningful keys a dialect, but reckon that RSpec is fast approaching a ‘real’ language or pidgin.

    By chromatic Wed, 08 Aug 2007 14:47:12 GMT

    “Dialect” reads better to me, but “pidgin” is catchy too.

    By Aristotle Pagaltzis Wed, 08 Aug 2007 18:43:50 GMT

    I think Parse::RecDescent qualifies as an actual DSL – external one even. After all, it actually parses grammars itself. Granted, part of the grammar is code that’s passed thru straight to Perl, but you can write a perfectly useful P::RD grammar without any Perl in it.

    To stay in your image, it’s more like the speech of a second-generation immigrant who can speak the native tongues of both his parents and the locals – like me and the other Greek kids I knew while growing up in Germany. When in an audience that understands both languages, such speakers will tend to seamlessly flip-flop from one language to the other in mid-sentence depending on what rolls off the tongue easier in the moment.

    (And I know this is not limited to my crowd. The same tendency is apparent in second-gen’ers with all combinations of language; whether emigrating from Greece, Italy or Turkey to Germany, or from Greece to Germany, the US/Australia/Canada or France, or from Mexico to the US, it is always true. The immigrants’ kids know both languages almost equally well and switch between them casually in mid-sentence.)

    By Piers Cawley Thu, 09 Aug 2007 01:11:01 GMT

    Aristotle: You are, of course, correct. My memory was playing me false and Parse::RecDescent does indeed involve a parser. In my memory it was one more of those Conway modules which feel like another language but are still parsed as Perl.

    By David Romano Thu, 16 Aug 2007 00:12:03 GMT

    Piers: Do you think your pidgin might just actually be a register? (I know, “register” isn’t as catchy as “pidgin” :-) It seems that most DSLs use vocabulary that simply expands the native language’s patterns of expression. This is no different than inventing mathematical jargon in English to express the problem domain of mathematics. How different is your embedded little language’s syntax/grammar from Ruby’s?

    Aristotle: Your description of flip-flopping back and forth between two languages mid-sentence is termed code switching in linguistics. As you said, it’s a very common phenomenon, but it is not restricted to those who speak both languages fluently.

    By Piers Cawley Thu, 16 Aug 2007 01:27:42 GMT

    Do you think your pidgin might actually be a register?

    I don’t think so, mostly because it’s pretty much entirely declarative. It’s for attribute declaration and (currently) looks something like:

    class Whatever
      attributes {
        foo.default = 42
        bar.should.match /.../
        baz.should.ensure {|val| val > foo}
        quux {
          default = 90
          should.ensure {|val| (val % 10).zero? }
        }
      }
    end

    The implementation leans heavily on evaluating the blocks involved in odd scopes and the semantics aren’t generally what you’d consider to be the usual ones for Ruby. For instance, the line foo.default = 42 has the effect of generating code along the lines of

    def foo
      # Fall back to the declared default only if nothing has been assigned yet.
      @foo = 42 unless instance_variable_defined? :@foo
      return @foo
    end

    which isn’t really the normal semantics you’d expect for assignment. Meanwhile the block in baz.should.ensure {|val| val > foo} gets evaluated in the context of an instance of the class before any assignment is made to baz via the accessor method.
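
One way the mechanics described above could work, as a minimal sketch and not the actual implementation: the block passed to `attributes` is run with `instance_eval` against a builder whose `method_missing` turns bare words like `foo` into declarations, and each `should.ensure` block is later run with `instance_exec` against the instance being assigned to. `AttributeProxy`, `AttributeBuilder`, `Attributes` and `__declared__` are hypothetical names; `should.match` and the nested `quux { ... }` form are left out.

```ruby
# Hypothetical sketch: every class, module and helper name here is invented
# for illustration and is not taken from the real implementation.

# Records what was declared about one attribute.
class AttributeProxy
  attr_reader :default_value, :checks

  def initialize
    @default_value = nil
    @checks = []
  end

  # Handles `foo.default = 42`.
  def default=(value)
    @default_value = value
  end

  # `should` exists only so that declarations read well.
  def should
    self
  end

  # Stores a check for later; `ensure` is legal as a method name.
  def ensure(&check)
    @checks << check
    self
  end
end

# The `attributes` block is evaluated against an instance of this class, so
# a bare word such as `foo` lands in method_missing and becomes a declaration.
class AttributeBuilder < BasicObject
  def initialize
    @declared = {}
  end

  def __declared__
    @declared
  end

  def method_missing(name, *_args)
    @declared[name] ||= ::AttributeProxy.new
  end
end

module Attributes
  def attributes(&block)
    builder = AttributeBuilder.new
    builder.instance_eval(&block)
    builder.__declared__.each do |name, proxy|
      ivar = :"@#{name}"
      # Reader: fall back to the default until something has been assigned.
      define_method(name) do
        instance_variable_set(ivar, proxy.default_value) unless instance_variable_defined?(ivar)
        instance_variable_get(ivar)
      end
      # Writer: run each check with self set to the instance, then assign.
      define_method(:"#{name}=") do |value|
        proxy.checks.each do |check|
          unless instance_exec(value, &check)
            raise ArgumentError, "invalid value for #{name}: #{value.inspect}"
          end
        end
        instance_variable_set(ivar, value)
      end
    end
  end
end

class Whatever
  extend Attributes

  attributes {
    foo.default = 42
    baz.should.ensure { |val| val > foo }
  }
end
```

With this sketch, `Whatever.new.foo` returns 42 until something else is assigned, and assigning to `baz` raises `ArgumentError` unless the value is greater than the instance’s current `foo`. The block given to `ensure` is written inside the builder’s scope but executed in the instance’s, which is the “odd scopes” part.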

    I’m not entirely sure pidgin is a particularly good word either; I’m just not sure I can think of a better one.

    By David Romano Thu, 16 Aug 2007 04:09:40 GMT

    Thanks for showing me what it looks like and means. Now that I see it, it’s definitely not a register.

    I’m not entirely sure pidgin is a particularly good word either; I’m just not sure I can think of a better one.

    Me neither. :-/

    By Warner Onstine Wed, 13 Feb 2008 14:05:45 GMT

    I’m not sure that pidgin is the right term here, specifically from a linguistic point of view. While pidgin languages do come about because two or more groups of people come together and have to communicate with each other, they are indeed “languages” in their own right. More specifically, these languages evolve over time into their own full-blown language, where the words being used no longer mean what they did in either language.

    There was a really good story I either read or heard on Pidgin Languages (it might have come from the book “The Language Instinct” by Pinker – very good book).

    I’m going to be sticking with Internal DSLs for my own reference, as I’m not very comfortable with labeling these as “Pidgin Languages”; in my eyes they aren’t really. Just my .02.

    By Adrian Fri, 16 Jan 2009 04:32:41 GMT

    Pidgins are created by the first generation; creole is the term for when it is learnt by the next generation as a native language. As a perlist indulging a linguistic hobby, I loved the crossover.

    I also missed Piers’s humour when he left Perl 6 news – so it’s good to catch up.
