Domain Specific Pidgin

Posted by Piers Cawley Wed, 08 Aug 2007 07:43:00 GMT

So, I’m busily writing an article about implementing an embedded little language in Ruby. It’s not something that’s going to need an entirely new parser; it borrows Ruby’s grammar and syntax but does some pretty language-like things to the semantics, and ends up feeling far more like a declarative language than the usual Ruby imperative OO style.

Because I tend towards chromatic’s view of many Ruby programmers’ readiness to cry “DSL!” the way the boy cried “Wolf!”, I don’t want to claim that it’s a full-blown Domain Specific Language, but it’s sufficiently language-like that ‘API’ doesn’t seem to fit as a description either.

Then it hit me… it’s a pidgin.

A pidgin can be thought of as a mashup of two languages, taking vocabulary from both its parents and its grammar (usually simplified) from one parent. Historically, pidgins have arisen to help with trade and colonization; grammars have tended to be lifted and simplified from the ‘native’ language and then spiked with words from the colonizing language with a leavening of native words where they make sense. All quite politically incorrect nowadays, but they served their purpose. Pidgins are, by their nature, domain specific; fine if you wanted to talk trade or order your coolies about, but not what you’d write poetry in. Poetry tended to get written in the creoles that evolved from some pidgins. A creole is a general purpose language, with a grammar of its own; they seem to evolve from pidgin, getting invented by the kids of parents who speak that pidgin.

In my little language, much of the vocabulary is pinched from my problem domain’s language, and the grammar and some terminology are lifted from Ruby. Casting the problem domain as the colonial power and Ruby as the native language, it’s obvious that I’ve invented a pidgin language.

Let’s embrace that term. We’re not writing domain specific languages, we’re writing pidgins. ActiveRecord’s family of class methods isn’t a DSL, it’s database pidgin. RSpec is a testing pidgin, Parsec is a parsing pidgin, and so are any number of APIs that make their host language feel like a new one.

Updates

In the comments, Aristotle Pagaltzis points out that Parse::RecDescent isn’t a parsing pidgin because it uses a full blown parser (itself, obviously) to parse any grammar declarations.

Comments

  1. Ben about 9 hours later:

    Interesting – I’ve been calling these (AR, RSpec, etc.) domain specific dialects for a while now. I think dialect might be a better term, given that the main distinguishing characteristic is the vocabulary, not the grammar (pidgins, as I recall, generally don’t take the grammar of one parent – they usually follow an extremely simplified grammar that is remarkably consistent regardless of the parent languages (which is often used as an argument for Universal Grammar, fwiw)).

  2. Piers Cawley about 10 hours later:

    We’re fast approaching “How many angels…?” territory here, but I’d be inclined to call ActiveRecord’s family of class methods taking hash arguments with meaningful keys a dialect, but reckon that RSpec is fast approaching a ‘real’ language or pidgin.

  3. chromatic about 12 hours later:

    “Dialect” reads better to me, but “pidgin” is catchy too.

  4. Aristotle Pagaltzis about 16 hours later:

    I think Parse::RecDescent qualifies as an actual DSL – external one even. After all, it actually parses grammars itself. Granted, part of the grammar is code that’s passed thru straight to Perl, but you can write a perfectly useful P::RD grammar without any Perl in it.

    To stay in your image, it’s more like the speech of a second-generation immigrant who can speak the native tongues of both his parents and the locals – like me and the other Greek kids I knew while growing up in Germany. When in an audience that understands both languages, such speakers will tend to seamlessly flip-flop from one language to the other in mid-sentence depending on what rolls off the tongue easier in the moment.

    (And I know this is not limited to my crowd. The same tendency is apparent in second-gen’ers with all combinations of language; whether emigrating from Greece, Italy or Turkey to Germany, or from Greece to Germany, the US/Australia/Canada or France, or from Mexico to the US, it is always true. The immigrants’ kids know both languages almost equally well and switch between them casually in mid-sentence.)

  5. Piers Cawley about 22 hours later:

    Aristotle: You are, of course, correct. My memory was playing me false and Parse::RecDescent does indeed involve a parser. In my memory it was one more of those Conway modules which feel like another language but still parsed as Perl.

  6. David Romano 7 days later:

    Piers: Do you think your pidgin might just actually be a register? (I know, “register” isn’t as catchy as “pidgin” :-) It seems that most DSLs use vocabulary that simply expands the native language’s patterns of expression. This is no different than inventing mathematical jargon in English to express the problem domain of mathematics. How different is your embedded little language’s syntax/grammar from Ruby’s?

    Aristotle: Your description of flip-flopping back and forth between two languages mid-sentence is termed code switching in linguistics. As you said, it’s a very common phenomenon, but it is not restricted to those who speak both languages fluently.

  7. Piers Cawley 7 days later:

    Do you think your pidgin might actually be a register?

    I don’t think so, mostly because it’s pretty much entirely declarative. It’s for attribute declaration and (currently) looks something like:

    class Whatever
      attributes {
        foo.default = 42
        bar.should.match /.../
        baz.should.ensure {|val| val > foo}
        quux {
          default = 90
          should.ensure {|val| (val % 10).zero? }
        }
      }
    end
    

    The implementation leans heavily on evaluating the blocks involved in odd scopes and the semantics aren’t generally what you’d consider to be the usual ones for Ruby. For instance, the line foo.default = 42 has the effect of generating code along the lines of

    def foo
      @foo = 42 unless instance_variable_defined? :@foo
      return @foo
    end
    

    which isn’t really the normal semantics you’d expect for assignment. Meanwhile the block in baz.should.ensure {|val| val > foo} gets evaluated in the context of an instance of the class before any assignment is made to baz via the accessor method.
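
    One way semantics like these could be wired up is sketched below. This is not the actual implementation: AttributeSpec, AttributeBuilder and the Attributes module are hypothetical names invented for illustration, and the sketch handles only the dotted form (foo.default = 42), not the nested quux { ... } form. It assumes the attributes block is run with instance_eval against a builder that catches unknown names, and that each check block is later re-run with instance_exec so that self is the instance being assigned to.

```ruby
# Hypothetical sketch; none of these names come from the original code.

# Records what was declared about a single attribute.
class AttributeSpec
  attr_reader :name, :checks
  attr_accessor :default

  def initialize(name)
    @name = name
    @checks = []
  end

  # Returns self so that `foo.should.ensure { ... }` reads naturally.
  def should
    self
  end

  # Stores the block; it is run later, against an instance of the class.
  def ensure(&block)
    @checks << block
    self
  end
end

# The attributes block is evaluated with this object as self, so a bare
# name such as `foo` arrives in method_missing and yields a spec.
class AttributeBuilder
  def initialize(specs)
    @specs = specs
  end

  def method_missing(name, *args)
    return super unless args.empty?
    @specs[name] ||= AttributeSpec.new(name)
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

module Attributes
  def attributes(&block)
    specs = {}
    AttributeBuilder.new(specs).instance_eval(&block)
    specs.each_value { |spec| define_attribute(spec) }
  end

  private

  def define_attribute(spec)
    ivar = "@#{spec.name}"

    # Reader: the default is assigned lazily, on first read.
    define_method(spec.name) do
      instance_variable_set(ivar, spec.default) unless instance_variable_defined?(ivar)
      instance_variable_get(ivar)
    end

    # Writer: every check runs with self set to the instance, before assignment.
    define_method("#{spec.name}=") do |value|
      spec.checks.each do |check|
        unless instance_exec(value, &check)
          raise ArgumentError, "invalid value for #{spec.name}: #{value.inspect}"
        end
      end
      instance_variable_set(ivar, value)
    end
  end
end

class Whatever
  extend Attributes

  attributes {
    foo.default = 42
    baz.should.ensure { |val| val > foo }
  }
end
```

    With this sketch, Whatever.new.foo returns 42 without any prior assignment, and assigning baz = 10 raises ArgumentError because the check compares the new value against that instance's foo.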

    I’m not entirely sure pidgin is a particularly good word either; I’m just not sure I can think of a better one.

  8. David Romano 8 days later:

    Thanks for showing me what it looks like and means. Now that I see it, it’s definitely not a register.

    I’m not entirely sure pidgin is a particularly good word either; I’m just not sure I can think of a better one.

    Me neither. :-/

  9. Warner Onstine 6 months later:

    I’m not sure that pidgin is the right term here, specifically from a linguistic point of view. While Pidgin languages do come about because of one or more groups of people coming together and having to communicate with each other, they are indeed “languages” in their own right. More specifically, these languages evolve over time into their own full-blown language, where the words being used no longer mean what they did in either language.

    There was a really good story I either read or heard on Pidgin Languages (it might have come from the book “The Language Instinct” by Pinker – very good book).

    I’m going to be sticking with Internal DSLs for my own reference, as I’m not very comfortable with labeling these as “Pidgin Languages”; in my eyes they aren’t really. Just my .02.

  10. Adrian about 1 year later:

    Pidgins are created by the first generation; creole is the term for when it is learnt by the next generation as a native language. As a perlist indulging a linguistic hobby, I loved the crossover.

    I also missed Piers’s humour when he left Perl 6 news – so it’s good to catch up.
