Improving Drupal code quality: Using Data Transfer Objects instead of arrays

From my experience over the years, Drupal developers tend to use arrays everywhere. In some situations that's perfectly fine, but in others it's not ideal at all. A better solution in those cases is to use Data Transfer Objects (DTOs).

Let's take a concrete example: adding an item to a queue. A typical Drupal developer might write something like this:

function qdrant_sync_add_to_queue(string $queue_name, EntityInterface $entity): void {
  /** @var \Drupal\Core\Queue\QueueFactoryInterface $queue_factory */
  $queue_factory = \Drupal::service('queue');
  $queue = $queue_factory->get($queue_name);
  
  $item = [
    'entity_type' => $entity->getEntityTypeId(),
    'entity_bundle' => $entity->bundle(),
    'entity_id' => $entity->id(),
    'entity_uuid' => $entity->uuid(),
  ];
  
  $queue->createItem($item);
}

While the code looks fine at first glance, what I don't like about arrays is that nothing guarantees the type of each value inside them. Keys aren't validated either: instead of 'entity_type' we could accidentally write 'entity_typ', and nothing would flag the mistake until the code that reads the item fails at runtime.

Another drawback of using arrays is that they're hard to read, and we can't always be sure if we've included all the required values or if we've missed something. Their self-documentation is extremely poor. Additionally, IDEs offer much better autocomplete and tooling support when working with objects compared to arrays.
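To make the typo problem concrete, here is a minimal standalone sketch in plain PHP 8.1+ (no Drupal required). The QueueItemSketch class is a hypothetical two-property stand-in used only for illustration. The misspelled array key is accepted silently, while the same misspelling as a named constructor argument fails right where the object is created, and is something an IDE or static analyser can flag before the code even runs.

```php
<?php

declare(strict_types=1);

// Array version: the misspelled key is accepted without complaint.
$item = [
  // Typo: should be 'entity_type'.
  'entity_typ' => 'node',
  'entity_id' => 123,
];

// The consumer only notices when it reads the key, and a null-coalescing
// fallback hides the mistake completely.
$entity_type = $item['entity_type'] ?? NULL;

/**
 * Hypothetical stand-in class, for illustration only.
 */
final class QueueItemSketch {

  public function __construct(
    public readonly string $entityType,
    public readonly int|string $entityId,
  ) {}

}

// Object version: the same typo as a named argument fails at the call site.
$error = NULL;
try {
  new QueueItemSketch(entityTyp: 'node', entityId: 123);
}
catch (\Error $e) {
  $error = $e->getMessage();
}
```

After this runs, $entity_type is NULL even though a value was "set", and $error holds the message of the Error thrown by the constructor call.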

Now, let's rewrite the code above to use a Data Transfer Object (DTO):

function qdrant_sync_add_to_queue(string $queue_name, EntityInterface $entity): void {
  /** @var \Drupal\Core\Queue\QueueFactoryInterface $queue_factory */
  $queue_factory = \Drupal::service('queue');
  $queue = $queue_factory->get($queue_name);
  
  $item = QdrantSyncQueueItem::fromEntity($entity);
  
  $queue->createItem($item);
}

Our DTO class would look something like this:

<?php

declare(strict_types=1);

namespace Drupal\qdrant_sync;

use Drupal\Core\Entity\EntityInterface;

/**
 * Data Transfer Object for queue items.
 */
final class QdrantSyncQueueItem {

  /**
   * Constructs a new QdrantSyncQueueItem instance.
   *
   * @param string $entityType
   *   The entity type ID.
   * @param string $entityBundle
   *   The entity bundle.
   * @param int|string $entityId
   *   The entity ID.
   * @param string $entityUuid
   *   The entity UUID.
   */
  public function __construct(
    private readonly string $entityType,
    private readonly string $entityBundle,
    private readonly int|string $entityId,
    private readonly string $entityUuid,
  ) {}
  
  /**
   * Creates a DTO from an entity.
   *
   * @param \Drupal\Core\Entity\EntityInterface $entity
   *   The entity to create the DTO from.
   *
   * @return self
   *   A new DTO instance.
   */
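  public static function fromEntity(EntityInterface $entity): self {
    // id() and uuid() may return NULL (for example, for an unsaved entity),
    // which the typed constructor rejects with a TypeError.
    return new self(
      $entity->getEntityTypeId(),
      $entity->bundle(),
      $entity->id(),
      $entity->uuid(),
    );
  }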
  
  /**
   * Gets the entity type.
   *
   * @return string
   *   The entity type ID.
   */
  public function getEntityType(): string {
    return $this->entityType;
  }
  
  /**
   * Gets the entity bundle.
   *
   * @return string
   *   The entity bundle.
   */
  public function getEntityBundle(): string {
    return $this->entityBundle;
  }
  
  /**
   * Gets the entity ID.
   *
   * @return int|string
   *   The entity ID.
   */
  public function getEntityId(): int|string {
    return $this->entityId;
  }
  
  /**
   * Gets the entity UUID.
   *
   * @return string
   *   The entity UUID.
   */
  public function getEntityUuid(): string {
    return $this->entityUuid;
  }
  
  /**
   * Converts the DTO to an array.
   *
   * @return array
   *   The DTO data as an array.
   */
  public function toArray(): array {
    return [
      'entity_type' => $this->entityType,
      'entity_bundle' => $this->entityBundle,
      'entity_id' => $this->entityId,
      'entity_uuid' => $this->entityUuid,
    ];
  }
}

Although using a DTO instead of an array makes the code longer, it becomes more readable, easier to maintain, and less error-prone, especially in larger and more complex projects. The DTO gives the data a single place to declare its types and, where needed, its validation rules, which simplifies team collaboration and surfaces mistakes earlier in development. In the long run, this approach pays off by reducing the need for bug fixes and refactoring.

The toArray() method is optional. If nothing in your codebase calls it, you can safely omit it. Another important point is that, in our example, the object is used as a queue item, and queue items in Drupal must be serializable because they are typically stored in the database. For this class, PHP handles serialization and unserialization automatically, since it contains only scalar properties. If you run into issues, you can implement the magic methods __serialize() and __unserialize(). That is usually only necessary when you want to control which properties are serialized, transform values, or keep sensitive data such as passwords out of storage.
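As a rough illustration of the serialization point, the sketch below round-trips a DTO through PHP's serialize() and unserialize(), which is what happens to an item stored by a database-backed queue. QueueItemSketch is a hypothetical, shortened stand-in for the class above, and the explicit __serialize() and __unserialize() methods are included only to show what they look like. For a class this simple they are not needed.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical, shortened stand-in for the queue item DTO.
 */
final class QueueItemSketch {

  public function __construct(
    private readonly string $entityType,
    private readonly int|string $entityId,
  ) {}

  public function getEntityType(): string {
    return $this->entityType;
  }

  public function getEntityId(): int|string {
    return $this->entityId;
  }

  /**
   * Chooses exactly which values end up in the stored payload.
   */
  public function __serialize(): array {
    return [
      'entity_type' => $this->entityType,
      'entity_id' => $this->entityId,
    ];
  }

  /**
   * Restores the object; readonly properties may be initialized once here.
   */
  public function __unserialize(array $data): void {
    $this->entityType = $data['entity_type'];
    $this->entityId = $data['entity_id'];
  }

}

// Round trip, as a storage backend would do it.
$stored = serialize(new QueueItemSketch('node', 123));
$restored = unserialize($stored);
```

After the round trip, $restored is a QueueItemSketch again with the same values, so the consumer can keep using the typed getters instead of array keys.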

Using a factory method like fromEntity is superior to direct constructor usage for several key reasons. Factory methods provide more expressive and self-documenting code, with names that clearly communicate their purpose - fromEntity($entity) immediately tells developers they're converting from an entity, while constructors are limited to just the class name. The factory method pattern encapsulates all entity-to-DTO conversion logic in one place, making future changes or additions to the conversion process much simpler to implement. Using factory methods results in cleaner client code by hiding complex object creation details and reducing repetitive code throughout your application.

Here's how you would create instances using constructors instead:

// Direct constructor usage without named parameters.
$dto = new QdrantSyncQueueItem(
  $entity->getEntityTypeId(),
  $entity->bundle(),
  $entity->id(),
  $entity->uuid()
);

// Constructor with named parameters (PHP 8+).
$dto = new QdrantSyncQueueItem(
  entityType: $entity->getEntityTypeId(),
  entityBundle: $entity->bundle(),
  entityId: $entity->id(),
  entityUuid: $entity->uuid()
);

// Constructor when you have raw data.
$dto = new QdrantSyncQueueItem(
  entityType: 'node',
  entityBundle: 'article',
  entityId: 123,
  entityUuid: '550e8400-e29b-41d4-a716-446655440000'
);
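The raw-data case is also where a second named constructor can earn its keep. The sketch below assumes a hypothetical fromArray() method, which is not part of the class shown earlier, on a stand-in class called QueueItemSketch. It checks for the required keys in one place before calling the constructor, so array-shaped input is validated once at the boundary.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the queue item DTO with an extra factory.
 */
final class QueueItemSketch {

  public function __construct(
    private readonly string $entityType,
    private readonly string $entityBundle,
    private readonly int|string $entityId,
    private readonly string $entityUuid,
  ) {}

  /**
   * Hypothetical named constructor for array-shaped input.
   */
  public static function fromArray(array $data): self {
    $required = ['entity_type', 'entity_bundle', 'entity_id', 'entity_uuid'];
    $missing = array_diff($required, array_keys($data));
    if ($missing !== []) {
      throw new \InvalidArgumentException('Missing keys: ' . implode(', ', $missing));
    }
    return new self(
      $data['entity_type'],
      $data['entity_bundle'],
      $data['entity_id'],
      $data['entity_uuid'],
    );
  }

  public function getEntityType(): string {
    return $this->entityType;
  }

}

// Well-formed input produces a DTO.
$item = QueueItemSketch::fromArray([
  'entity_type' => 'node',
  'entity_bundle' => 'article',
  'entity_id' => 123,
  'entity_uuid' => '550e8400-e29b-41d4-a716-446655440000',
]);

// A misspelled key is reported immediately instead of surfacing later.
$message = NULL;
try {
  QueueItemSketch::fromArray(['entity_typ' => 'node']);
}
catch (\InvalidArgumentException $e) {
  $message = $e->getMessage();
}
```

Because the constructor is typed and the file uses strict types, a value of the wrong type in the array would still be rejected with a TypeError even though fromArray() only checks for the presence of the keys.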

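On the consuming side, the queue worker's processItem() method could look like this. The parameter stays untyped because QueueWorkerInterface::processItem($data) declares it without a type, and PHP does not allow an implementation to narrow a parameter type, so the DTO type is checked inside the method:

/**
 * {@inheritdoc}
 */
public function processItem($data) {
  // The interface signature is untyped, so verify the DTO type here.
  assert($data instanceof QdrantSyncQueueItem);
  $storage = $this->entityTypeManager->getStorage($data->getEntityType());
  $entity = $storage->load($data->getEntityId());
  ...
  ...
}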

Let's take a look at another example of using a DTO. I used something like this when working with webhooks, where I created a class for the result of handling an incoming webhook. The class is fairly simple:

<?php

namespace Drupal\MY_MODULE;

/**
 * Value object for webhook responses.
 */
final class WebhookResponse {

  /**
   * Constructs a WebhookResponse object.
   *
   * @param string $message
   *   The response message.
   * @param int $status
   *   The HTTP status code.
   */
  public function __construct(
    private readonly string $message,
    private readonly int $status = 200,
  ) {}

  /**
   * Creates a successful response.
   *
   * @param string $message
   *   The success message.
   *
   * @return self
   *   A new WebhookResponse instance.
   */
  public static function success(string $message): self {
    return new self($message, 200);
  }

  /**
   * Creates an error response.
   *
   * @param string $message
   *   The error message.
   * @param int $status
   *   The HTTP status code (defaults to 400).
   *
   * @return self
   *   A new WebhookResponse instance.
   */
  public static function error(string $message, int $status = 400): self {
    return new self($message, $status);
  }

  /**
   * Gets the response message.
   *
   * @return string
   *   The response message.
   */
  public function getMessage(): string {
    return $this->message;
  }

  /**
   * Gets the status code.
   *
   * @return int
   *   The HTTP status code.
   */
  public function getStatus(): int {
    return $this->status;
  }

}

Then we can handle the webhook and return a result type-hinted with the WebhookResponse class:

public function handleWebHook(string $signature): WebhookResponse {
  ...
  ...
  try {
    ...
    ...
    return WebhookResponse::success((string) $message);
  }
  catch (UnexpectedDataException $e) {
    return WebhookResponse::error('Invalid payload: ' . $e->getMessage());
  }
  catch (\Exception $e) {
    return WebhookResponse::error('Error processing webhook: ' . $e->getMessage(), 500);
  }
}
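To show what the return type buys the calling code, here is a small standalone sketch. It repeats a condensed copy of the WebhookResponse class so that it runs on its own, and adds a hypothetical helper, webhook_response_to_http(), that is not part of the original code. The helper turns the value object into a status code and a JSON body without guessing at array keys.

```php
<?php

declare(strict_types=1);

/**
 * Condensed copy of the WebhookResponse class, so the sketch runs alone.
 */
final class WebhookResponse {

  public function __construct(
    private readonly string $message,
    private readonly int $status = 200,
  ) {}

  public static function success(string $message): self {
    return new self($message, 200);
  }

  public static function error(string $message, int $status = 400): self {
    return new self($message, $status);
  }

  public function getMessage(): string {
    return $this->message;
  }

  public function getStatus(): int {
    return $this->status;
  }

}

/**
 * Hypothetical helper: maps the value object to a status code and JSON body.
 */
function webhook_response_to_http(WebhookResponse $response): array {
  return [
    'status' => $response->getStatus(),
    'body' => json_encode(['message' => $response->getMessage()], JSON_THROW_ON_ERROR),
  ];
}

$ok = webhook_response_to_http(WebhookResponse::success('Processed'));
$failed = webhook_response_to_http(WebhookResponse::error('Invalid payload: missing id'));
```

In a Drupal controller, the same two getters could feed Symfony's JsonResponse directly, for example new JsonResponse(['message' => $response->getMessage()], $response->getStatus()).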

DTOs offer a more robust and maintainable alternative to arrays in Drupal development, providing type safety, improved IDE support, and clearer code structure. By encapsulating data and its associated behavior, DTOs help prevent common errors and make code more self-documenting, ultimately leading to more reliable applications. Whether you're working on a small module or a large enterprise application, considering DTOs for your data structures can significantly improve your code quality and development experience.

About the Author

Goran Nikolovski is a web and AI developer with over 10 years of expertise in PHP, Drupal, Python, JavaScript, React, and React Native. He founded this website and enjoys sharing his knowledge.