Batch processing and update hooks in Drupal 8 and 9

Operations in an update hook that you know will take a lot of time or processing power should run as a batch job. Say you have a thousand articles and want to change a field value on all of them. Loading every node in a single process risks hitting PHP's memory or execution time limits. You could write a direct database query to speed things up, but bypassing the Entity API is a bad idea unless you really know what you are doing.

An example of bypassing the Entity API can be found in the Drupal Commerce module. Take a look at the following code:

commerce_order.install#L94

The Entity API is bypassed and the database tables are updated with a regular SQL query. As I said, this is fine if you know what you are doing, but you can also use batch processing for expensive operations.

To use the Batch API in your update hook, accept the $sandbox parameter by reference. $sandbox acts as the batch context: whatever you store in it persists between calls, and the hook is called again and again until it reports completion. Setting $sandbox['#finished'] to a value between 0 and 1 reports the fraction of the work that is complete, and setting it to 1 means that you are done with your operations.

function MY_MODULE_update_8001(&$sandbox) {
   if (!isset($sandbox['total'])) {
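     // Count the users once, without loading every ID into memory.
     // Access checking is off so the total does not depend on who runs the update.
     $sandbox['total'] = (int) \Drupal::entityQuery('user')
       ->accessCheck(FALSE)
       ->count()
       ->execute();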
     $sandbox['current'] = 0;

     if (empty($sandbox['total'])) {
       $sandbox['#finished'] = 1;
       return;
     }
   }

   $users_per_batch = 25;
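   // Sort by ID so that every pass sees the users in the same order;
   // without a stable order, range() can skip or repeat rows between passes.
   $uids = \Drupal::entityQuery('user')
     ->accessCheck(FALSE)
     ->sort('uid')
     ->range($sandbox['current'], $users_per_batch)
     ->execute();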
   if (empty($uids)) {
     $sandbox['#finished'] = 1;
     return;
   }

   foreach ($uids as $uid) {
     $user = \Drupal\user\Entity\User::load($uid);
     // ... do something with the loaded user entity ...
     $user->save();
     $sandbox['current']++;
   }

   \Drupal::messenger()
     ->addMessage($sandbox['current'] . ' users processed.');

   if ($sandbox['current'] >= $sandbox['total']) {
     $sandbox['#finished'] = 1;
   }
   else {
     $sandbox['#finished'] = ($sandbox['current'] / $sandbox['total']);
   }
}
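
To see how the '#finished' value drives the repeated calls, here is a minimal sketch in plain PHP that runs without Drupal. Both functions are hypothetical names made up for illustration: example_process_slice() stands in for an update hook, and example_run_until_finished() is a simplified stand-in for the update runner, not Drupal's actual implementation. The only assumption is that the runner keeps calling the hook with the same $sandbox array until '#finished' reaches 1.

```php
<?php

/**
 * Hypothetical update-style callback: handles one slice of IDs per call.
 * It stands in for an update hook and does not touch Drupal at all.
 */
function example_process_slice(array &$sandbox, array $all_ids, int $per_pass = 25): void {
  // First call: initialize the bookkeeping that persists between calls.
  if (!isset($sandbox['total'])) {
    $sandbox['total'] = count($all_ids);
    $sandbox['current'] = 0;
  }

  // Take the next slice; real code would load and save entities here.
  $slice = array_slice($all_ids, $sandbox['current'], $per_pass);
  $sandbox['current'] += count($slice);

  // Report progress as a fraction; 1 means there is nothing left to do.
  $sandbox['#finished'] = ($sandbox['total'] === 0 || empty($slice))
    ? 1
    : $sandbox['current'] / $sandbox['total'];
}

/**
 * Simplified stand-in for the update runner: calls the callback with the
 * same $sandbox array until '#finished' reaches 1.
 */
function example_run_until_finished(array $all_ids, int $per_pass = 25): array {
  $sandbox = [];
  $passes = 0;
  do {
    example_process_slice($sandbox, $all_ids, $per_pass);
    $passes++;
  } while ($sandbox['#finished'] < 1);

  return ['passes' => $passes, 'processed' => $sandbox['current']];
}

$result = example_run_until_finished(range(1, 1000), 25);
printf("%d items in %d passes\n", $result['processed'], $result['passes']);
```

With a thousand IDs and 25 per pass, the script prints "1000 items in 40 passes". Each pass only ever holds 25 items, which is the whole point of using the sandbox instead of a single loop.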

If an entity query does not return the results you expect, the access check is a likely cause. In Drupal 8 and 9, entity queries are access-checked by default, so the results are filtered by what the current user is allowed to see. In that case you might want to disable the access check:

$order_storage = \Drupal::entityTypeManager()->getStorage('commerce_order');
$query = $order_storage->getQuery()->accessCheck(FALSE);

Make sure to disable the access check only if it's really needed.

About the Author

Goran Nikolovski is a web and AI developer with over 10 years of expertise in PHP, Drupal, Python, JavaScript, React, and React Native. He founded this website and enjoys sharing his knowledge.