CoCKTaiL - a CCK Type Language

Djun Kim
2008-11-01

Abstract

In this post, we describe some of the issues facing large or complex software projects using Drupal's CCK content type extension facilities. We then propose a remedy in the form of CoCKTaiL, a high-level language for CCK types and associated UI elements and relations. Tools and potential applications will be described in future work.

Introduction

Drupal's Content Construction Kit (CCK) is a powerful feature, much loved by Drupal developers, who use it to rapidly define data types and relations for applications. Using a UI, developers can easily create new content types by specifying fields, UI widgets, relations, and more.

The ability to interactively create new types and supporting UIs is a great timesaver in the early stages of many Drupal development projects.

For larger projects, however, the interactive, UI-driven way in which CCK types are designed and implemented makes projects with CCK components hard to evolve and maintain, and this adds to the lifecycle cost of large Drupal applications.

Challenges include:

  • succinctly and precisely specifying CCK types
  • importing and exporting CCK types and data to and from Drupal instances
  • versioning changes to CCK configuration for Drupal instances
  • migrating data between different versions of Drupal applications, and/or into or out of Drupal sites
  • understanding and manipulating the data model for complex CCK based applications
  • validating CCK implementations against external requirements
  • rapidly, accurately, and repeatably creating and modifying CCK based type structures
  • automatically generating CCK types and supporting code from high-level descriptions
  • and many others...

Examples of issues with existing CCK

Hey, who changed the CCK definition?

If you've worked on a team building a site involving a lot of CCK types, you may have had the joy of discovering that a type definition has mysteriously changed. By the time you discover who made the change and why, and what the impact is on your code, half a day has gone by.

We need the CCK defs from site X, but with a new userref in all types and different titles for some of the types!

"Sure, no problem. We'll export the fields one by one, then import them and make the changes in place." Meanwhile, time passes...

I've been given a gorgeous ER diagram specifying all the types we need, their fields and relationships... now what?

Wouldn't it be nice to be able to translate that into a high-level specification that will generate the CCK for you?

Introducing CoCKTaiL

A possible solution to many of these issues would be a suitable 'high-level' representation of CCK types, for description, modeling, exchange, and design-build automation, written in a formal type language.

I have been working on such a language, which I have called CoCKTaiL (for CCK Type Language); one of my goals has been to ensure that CoCKTaiL:

  • allows full and precise definitions of CCK types, UI elements, etc.
  • is compact
  • is implementation language independent
  • is a textual representation (works with version control and text editing tools)
  • has a formally specified (context-free) syntax for machine generation and parsing
  • has "null" options that correspond to sensible defaults
  • is human readable and writable
  • is familiar? (what are analogs?)
  • is extensible, to handle extensions to CCK
    • add productions to grammar, snippets and rules to code generator
    • each CCK type module would include extensions (hooks?)
  • is well-supported by tools:
    • a parser
    • validator
    • CoCKTaiL blender (UML -> CoCKTaiL)
    • CCK type generator (CoCKTaiL -> CCK code)
    • CCK code de-compiler (CCK -> CoCKTaiL)
  • is a suitable target for automatic translation to/from UML (e.g. UML -> CoCKTaiL -> CCK code, and round trip?)

Examples

Here is an example of what a type definition specified via CoCKTaiL might look like. (Compare the ease of typing this to the time involved in generating it via the CCK user interface.)


// CoCKTaiL Type definition for Client info
type 'Client info' {
  description 'Basic information about a client';
  title_label 'Name';
  body_label 'Description';
  status false;
  comment 2;

  group 'Contact info' {
    style fieldset;
    description 'Contact info for Client';
    display_label inline;
    field text 'Street address 1' {}
    field text 'Street address 2' {}
    field phone 'Home phone' {}
    field phone 'Mobile phone' {}
  }

  field textarea 'Notes' {}
}
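To make the idea of tool support a little more concrete, here is a minimal sketch in Python of how a definition like the one above could be read into a nested data structure. It is not the CoCKTaiL implementation: the grammar it accepts is an assumption inferred from this single example (blocks introduced by type, group, or field, and attributes of the form name value;), it uses a hand-written recursive-descent parser rather than one generated from a formal grammar, and all function names and dictionary keys are hypothetical.

```python
import re

# Token pattern for the assumed syntax; whitespace and // comments carry no
# named group, so they are matched and then dropped by tokenize().
TOKEN_RE = re.compile(r"""
    \s+                            # whitespace (skipped)
  | //[^\n]*                       # line comment (skipped)
  | (?P<string>'[^']*')            # single-quoted string
  | (?P<punct>[{};])               # block and statement delimiters
  | (?P<word>[A-Za-z_]\w*|\d+)     # keyword, identifier, or integer
""", re.VERBOSE)


def tokenize(text):
    """Split source text into (kind, text) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup is not None:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def to_value(token):
    """Convert a value token to a Python str, bool, or int."""
    kind, text = token
    if kind == "string":
        return text[1:-1]
    if text in ("true", "false"):
        return text == "true"
    return int(text) if text.isdigit() else text


def expect(tokens, i, text):
    """Check that token i is the given delimiter and step past it."""
    if i >= len(tokens) or tokens[i][1] != text:
        raise SyntaxError(f"expected {text!r} at token {i}")
    return i + 1


def parse_block(tokens, i=0):
    """Parse statements up to a closing '}' or end of input; return (node, next index)."""
    node = {"attrs": {}, "children": []}
    while i < len(tokens) and tokens[i] != ("punct", "}"):
        keyword = tokens[i][1]
        if keyword in ("type", "group", "field"):
            child = {"kind": keyword}
            i += 1
            if keyword == "field":             # field <field-type> 'Label' { ... }
                child["field_type"] = tokens[i][1]
                i += 1
            child["name"] = to_value(tokens[i])
            i = expect(tokens, i + 1, "{")
            body, i = parse_block(tokens, i)
            i = expect(tokens, i, "}")
            child.update(body)
            node["children"].append(child)
        else:                                  # attribute: <name> <value>;
            node["attrs"][keyword] = to_value(tokens[i + 1])
            i = expect(tokens, i + 2, ";")
    return node, i


def parse(text):
    """Return the list of top-level definitions found in the source text."""
    tokens = tokenize(text)
    root, i = parse_block(tokens)
    if i != len(tokens):
        raise SyntaxError(f"unexpected '}}' at token {i}")
    return root["children"]


SOURCE = """
// CoCKTaiL Type definition for Client info
type 'Client info' {
  description 'Basic information about a client';
  status false;
  comment 2;
  group 'Contact info' {
    style fieldset;
    field phone 'Home phone' {}
  }
  field textarea 'Notes' {}
}
"""

client = parse(SOURCE)[0]
print(client["name"], client["attrs"])
```

A structure like this would be the natural input for a validator or code generator. Error handling is deliberately thin here; truncated input, for instance, raises a plain IndexError rather than a helpful message.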

Next steps

In my next post I will describe a formal (LALR(1)) grammar for CoCKTaiL, and the tools I am using to develop a validator, a code generator, and other utilities.

Go on, do it, you know you

Go on, do it, you know you want to, just model it in UML :)

My secret motivation has been revealed!

It will be interesting to see how well UML and the current toolsets can be applied to our situations....

Looks awesome

This looks really nice. I would love to use something like that. I'm right now working on a project facing a very complex nodetype -> I have actually not used cck but made it all manually as I need full control. This though looks veeeeeery nice.

Thanks

Check back soon, perhaps we will have something that you will be able to use for your next project. It would be great to have your feedback.

Regards, Djun

Project is almost over

hi. Unfortunately the project is in the last stage, and I hope there won't be any changes to the content type anymore.

why create a language?

This looks interesting. But I'm not sure that we need a new language. The Drupal way seems to be to define things using arrays.

Are you proposing a new utility that is totally separate from Drupal, simply for importing and exporting CCK types?

I'd like to see a Drupal module, with an import/export feature, where the definitions could be pasted into php. This would lend itself more to an array syntax rather than a language definition.

I've done a lot of language work in the past and I understand parsers. But we have gotten pretty far in Drupal without them. I even avoided it in coder, opting instead for lots of regex checks.

So, I'm wondering why you think that the CCK definition needs to be a language. It feels very un-Drupalish.

Thanks!

A good question

Thanks for taking the time to read and reply to my posts.

That's a good question, and one that I've wrestled with a bit. As you can probably tell from what follows, I have some doubts about whether there really needs to be a language for this. In the end I've convinced myself that it's worthwhile to at least experiment with this.

This looks interesting. But I'm not sure that we need a new language. The Drupal way seems to be to define things using arrays.

There are a couple of motivations for going to a 'little language', rather than sticking to the existing CCK representation. I wanted a representation that was

  1. a little more language/platform agnostic.
  2. higher level, to eliminate 'minor details'
  3. suitable for use as a modeling language, or intermediary for UML.
  4. easy for humans to read, edit, and version

Basically, 'easier for humans, still easy for machines' is what I'm aiming for. Eventually (soon?) I'm hoping to be able to model my types with a UML tool, convert to CoCKTaiL, and then have the CCK import-ready PHP code generated.
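As a rough illustration of that last step, here is a minimal sketch in Python that renders an in-memory type description as PHP array source text. It is only a sketch under stated assumptions: the key names in the example dictionary are hypothetical and are not the actual CCK export format, and php_literal is a made-up helper name. A real generator would have to emit exactly the structure that CCK's import expects.

```python
def php_literal(value, indent=0):
    """Render a Python bool, int, str, list, or dict as PHP source text."""
    pad = "  " * indent
    if isinstance(value, bool):          # test bool before int: bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # PHP single-quoted strings only need backslash and quote escaped.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    if isinstance(value, dict):
        items = list(value.items())
    elif isinstance(value, list):
        items = list(enumerate(value))   # lists become integer-keyed arrays
    else:
        raise TypeError(f"cannot render {type(value).__name__} as PHP")
    if not items:
        return "array()"
    lines = [
        f"{pad}  {php_literal(key)} => {php_literal(item, indent + 1)},"
        for key, item in items
    ]
    return "array(\n" + "\n".join(lines) + f"\n{pad})"


# Hypothetical intermediate form of the 'Client info' type; the keys are
# illustrative only and do not mirror the real CCK export arrays.
client_info = {
    "name": "Client info",
    "status": False,
    "comment": 2,
    "fields": [{"type": "phone", "label": "Home phone"}],
}

print("$content = " + php_literal(client_info) + ";")
```

Keeping the generator this dumb is deliberate: all knowledge of defaults and field types would live in the step that builds the intermediate structure, so the output stage only has to get quoting and nesting right.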

Are you proposing a new utility that is totally separate from Drupal, simply for importing and exporting CCK types?

There are lots of places this could go. At the moment, I'm working on a couple of standalone utilities. Clearly it would be easy to incorporate the parser into a module to accept CoCKTaiL for import rather than the CCK PHP code.

But perhaps the most interesting application would be 'Enterprise integration' - a way for organizations with heterogeneous, complex systems to start to see Drupal as a component for cost-effective, rapidly deployable solutions to integrate with or replace existing systems.

I'd like to see a Drupal module, with an import/export feature, where the definitions could be pasted into php. This would lend itself more to an array syntax rather than a language definition.

Right! Import from CoCKTaiL should be easy. Export to CoCKTaiL, I'll need to look, but it doesn't seem too difficult.

I've done a lot of language work in the past and I understand parsers. But we have gotten pretty far in Drupal without them. I even avoided it in coder, opting instead for lots of regex checks.

PHP has the cool feature of having its tokenizer available, which provides a lot of muscle to certain applications. But the (surprising, to me) truth is that for PHP there haven't been good parser tools available until fairly recently. Maybe people didn't take it seriously? To me, it now looks like it's pretty much becoming what C++ was hoping to be. I expect we'll see more good tools for PHP.

Another aspect is that a lot of people have abandoned writing parsers (which can be a pain) in favour of using XML and transformations. This is great as long as humans don't have to read or understand the language :)

So, I'm wondering why you think that the CCK definition needs to be a language. It feels very un-Drupalish.

I think that the present CCK representation is great - as an internal, or even Drupal to Drupal representation. It's direct, does the job, and is close to the objects it's representing. I'm writing CoCKTaiL as an experiment in representation in what might be thought of as a 'modeling and messaging' layer, which is by definition outside of Drupal. So, hopefully it is not inappropriate that CoCKTaiL be somewhat un-Drupalish, as it's kind of an adaptor, if you will.

I just made that up :) I hope it's not too hand-wavey.

Thanks!

BTW, I've started with a bunch of code metrics and static analysis work, too. I'd love to chat with you about coder module and your thoughts for future directions.

Cheers, Djun

something is needed, bu

When I get in a cycle of launching new sites and pass off the website to the next people down the line to do the launch, I'd rather not have them learning a new language. Sure, this solution might be great for people who are programmer-minded, but I think ultimately the right way is to create more flexible GUI options for configuration.

Both approaches are needed, I think

Hi deekayen, I appreciate your point about not wanting to learn yet another language.

I agree that better UIs and tools are needed. It's possible that CoCKTaiL could be a part of such a solution, too - having something that might be a target for generation from UML might make sense to people who like those ways of working. Also, the GUIs might manipulate objects which generate CoCKTaiL (hence delegating responsibility for the nitty-gritty details).

In any case, I hope that this is closer to the kind of representation one might scribble on the back of a napkin, i.e., a little more intuitive for people. Hopefully the strength of this will be to allow more interaction and iteration at the design level, with an easy transition to code from the work products of the design activities.

Forgive me for ranting, but...

I can understand why you would find PHP too inflexible for this task, but things like XML or YAML or even JSON could easily be adapted for this, and in doing so, you could make the world a better place by not adding to the heaps of buggy parsers for buggy utility languages that languish on millions of servers around the world.

Please consider it.

I'd considered this

and I think that there's a place for this, also, perhaps.

In particular, if machine-to-machine messages were the only thing I was considering, there would be no question.

Actually, one of my motivations was to provide a target for translation from XML (or other machine domain) based representations.

However, I felt it was important to have something that is human readable. I don't know YAML or JSON - if you can suggest a way to achieve this using one or the other of those, that would be cool!

Regards, Djun

YAML is it

YAML is really nice and readable for this. See this article - devzone.zend.com/article/2585-Using-YAML-With-PHP-and-PECL. There are php classes we can include instead of using the PHP extension that they recommend there.

Cool, I will have a look at this

Thanks for the reference, Moshe.

YAML's been one of those 'hmm, must look at this one of these days' things for me, and I guess today is the day :)

Another reference to YAML

For those following along, here's another reference, this time to the YAML spec

Zend Framework... UML buzzword thoughts

Actually, the Zend Framework has adapters (currently for XML and "INI" style config files), which could prove an alternative.

However, before we can start saying that we are "applying UML" we need to see how the separation of the entities' data persistence from the business objects controlled by the system will evolve. CCK in itself tends to tie the form binding, the business objects, and the database persistence together, which doesn't really allow much more than an ERD-to-CCK translation; I wouldn't call that UML.

However, I have had semi-complex ERD diagrams which needed to be implemented in CCK, and had to do the whole thing by hand, of course, so in that sense this would be a boon.

And I thoroughly agree with the case against a parser; go YAML or XML or both. Arrays are "the Drupal Way" for objects in memory, but not necessarily for the configuration language.

Victor Kane
http://awebfactory.com.ar

JSON

JSON is at http://www.json.org/. There are several PHP parsers and it's native in PHP 5.2.0. Worth a look.
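For comparison, here is a rough sketch of how the 'Client info' type might look if JSON were the carrier format, using Python's standard json module for the round trip. The key names ("groups", "fields", "label", and so on) are assumptions made for illustration; nothing in the post fixes a JSON schema for CoCKTaiL.

```python
import json

# Hypothetical JSON-friendly shape for the 'Client info' type definition.
client_info = {
    "type": "Client info",
    "description": "Basic information about a client",
    "title_label": "Name",
    "body_label": "Description",
    "status": False,
    "comment": 2,
    "groups": [
        {
            "name": "Contact info",
            "style": "fieldset",
            "display_label": "inline",
            "fields": [
                {"type": "text", "label": "Street address 1"},
                {"type": "phone", "label": "Home phone"},
            ],
        }
    ],
    "fields": [{"type": "textarea", "label": "Notes"}],
}

# Serialize to indented text (the form a human would read or version),
# then parse it back to confirm nothing is lost in the round trip.
text = json.dumps(client_info, indent=2)
print(text)
assert json.loads(text) == client_info
```

Whether the indented JSON output is readable enough for hand editing, compared with the brace-and-keyword form in the post, is exactly the judgment call being debated in this thread.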

Great Name!

I think it was about three years ago that I suggested that CCK itself be called Cocktail.

http://lists.drupal.org/archives/development/2005-03/msg00063.html

So I obviously love the name. I also love that you named part of the package a Blender.

As for the module itself:
It shouldn't be sacrilegious or un-Drupal-like to NOT use a structured array. In fact I have commented a few times that Drupal's structured array "objects" are a bit of a pain since my IDE can't make suggestions for the next element I might consider adding to [say] a form.

But this solution has the same symptom. It does also seem as though some wheels have been re-invented here. I will reserve judgment until I read Part 2 of this article - still I can't help but like it because of the name alone.

Thanks, Andre

Your observations about having a structure that an IDE can provide help with are useful.

I'll have to look into this. Do you have a favorite IDE? Komodo, Eclipse, Emacs, Vim...

The syntax is somewhat C-like, so things like delimiter balancing and highlighting should work, as well as string editing. There are a few reserved words, and identifiers for field types could also be thought of as an enumerated type. Everything else is basically lists of attributes.

IDE hinting is definitely something I'll have to play with.... it's probably a good way to judge how 'natural' the language will feel to programmers.

Thanks!

the date on this post is

the date on this post is fubar and it's messing up http://drupal.org/planet by keeping it constantly at the top of the list. please fix.

Hmm...

I'd changed the post date from when it was originally posted to the date I published it.

It doesn't seem to be stuck anymore on planetDrupal, as far as I can tell.

Lemme know if it still seems to be fubar.

Ruby

Djun,

This is cool stuff, and I see where it will be useful.

However, it's also a demonstration that PHP simply doesn't provide powerful enough abstractions. Just to play devil's advocate, here's how your DSL might look as implemented in Ruby (without necessitating any parsing):

# CoCKTaiL Type definition for Client info (in Ruby)
class ClientInfo
  type 'Client info'
  description 'Basic information about a client'
  title_label 'Name'
  body_label 'Description'
  status false
  comment 2

  group 'Contact info' do
    style :fieldset
    description 'Contact info for Client'
    display_label :inline
    field :text, 'Street address 1'
    field :text, 'Street address 2'
    field :phone, 'Home phone'
    field :phone, 'Mobile phone'
  end

  field :textarea, 'Notes'
end

(Anyhow, I'm going to need to have another look at phplemon - it wasn't usable the last time I looked. More parsing tools for PHP are most welcome...)