Handling errors correctly is tricky, and even more so in gRPC. gRPC currently offers only limited built-in error handling, based on simple status codes and metadata. In this article, we will look at the limitations of gRPC error handling and how to overcome them to build a robust error-handling framework. In the next article, we will examine how to handle errors in RESTful APIs using Spring Boot.
Code Example
The working code example for this article is available on GitHub. To run the example, clone the repository and import grpc-spring-boot as a project in your favorite IDE.
The code example consists of two microservices:
- Product Gateway – acts as an API Gateway (client of Product Service) and exposes REST APIs (Gradle module product-api-gateway)
- Product Service – exposes gRPC APIs (Gradle module product-service)
There is a third Gradle module, called commons, which contains common exceptions used by both Product Gateway Service and Product Service.
You can start these services from your IDE by running the main methods of ProductGatewayApplication and ProductApplication, respectively.
You can test the application by calling the Product Gateway Service API:
curl --location --request GET 'http://localhost:8080/products/32c29935-da42-4801-825a-ac410584c281' \
--data-raw ''
Error handling in gRPC
By default, gRPC relies heavily on status codes for error handling, but this approach has drawbacks. Let's look at an example.
In our sample application, the server-side Product Service exposes a gRPC method, getProduct. It fetches a Product from ProductRepository and returns the response to the client:
public void getProduct(
    GetProductRequest request, StreamObserver<GetProductResponse> responseObserver) {
  String productId = request.getProductId();
  var product = productRepository.get(productId);
  var response =
      GetProductResponse.newBuilder()
          .setName(product.getName())
          .setDescription(product.getDescription())
          .setPrice(product.getPrice())
          .setUserId(product.getUserId())
          .build();
  responseObserver.onNext(response);
  responseObserver.onCompleted();
  log.info("Finished calling Product API service..");
}
ProductRepository fetches the data from productStorage and returns the Product, or throws an error if the Product is not found:
public Product get(String productId) {
  var product = Optional.ofNullable(productStorage.get(productId));
  return product.orElseThrow(() -> new ResourceNotFoundException("Product ID not found"));
}
You may ask why we need a custom exception at all. Why not throw the gRPC-specific StatusRuntimeException, as in:
product.orElseThrow(() -> Status.NOT_FOUND.withDescription("Product ID not found").asRuntimeException());
The biggest benefit is separation of concerns. You don't want to pollute business logic with gRPC-specific code, which belongs to the transport (API) layer.
The client application (Product Gateway Service) is responsible for calling the server application and converting the received response into a domain object. In case of an error, it simply wraps the error in a domain-specific exception, ServiceException(error.getCause()), and throws it to be handled upstream.
//Client call
public Product getProduct(String productId) {
  Product product = null;
  try {
    var request = GetProductRequest.newBuilder().setProductId(productId).build();
    var productApiServiceBlockingStub = ProductServiceGrpc.newBlockingStub(managedChannel);
    var response = productApiServiceBlockingStub.getProduct(request);
    // Map to domain object
    product = ProductMapper.MAPPER.map(response);
  } catch (StatusRuntimeException error) {
    log.error("Error while calling product service, cause {}", error.getMessage());
    throw new ServiceException(error.getCause());
  }
  return product;
}
This seems pretty straightforward, but there is one problem. In case of an error, the client sees:
io.grpc.StatusRuntimeException: UNKNOWN
Why do we see StatusRuntimeException with the status UNKNOWN? Our custom ResourceNotFoundException is not propagated as-is: gRPC swallows the error message, assigns the default status code UNKNOWN, and the client receives a StatusRuntimeException.
We can improve error handling by catching ResourceNotFoundException in the server's service implementation and calling responseObserver.onError(..):
//Server Product Service API
public void getProduct(
    GetProductRequest request, StreamObserver<GetProductResponse> responseObserver) {
  String productId = request.getProductId();
  try {
    var product = productRepository.get(productId);
    var response =
        GetProductResponse.newBuilder()
            .setName(product.getName())
            .setDescription(product.getDescription())
            .setPrice(product.getPrice())
            .setUserId(product.getUserId())
            .build();
    responseObserver.onNext(response);
    responseObserver.onCompleted();
  } catch (ResourceNotFoundException error) {
    log.error("Product id, {} not found", productId);
    var status = Status.NOT_FOUND.withDescription(error.getMessage()).withCause(error);
    responseObserver.onError(status.asException());
  }
  log.info("Finished calling Product API service..");
}
On the client-side, you will see:
Error while calling product service, cause NOT_FOUND: Product ID not found
Notice that the client does not get the original ResourceNotFoundException thrown by the server, so error.getCause() on the client returns null.
throw new ServiceException(error.getCause()); //error.getCause() is null
Why?
The official documentation of Status.withCause(Throwable cause) states that the cause is not transmitted from server to client:
"Create a derived instance of Status with the given cause. However, the cause is not transmitted from server to client." (grpc-java documentation)
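Since the cause never arrives, wrapping error.getCause() hands the caller an exception with no information in it. One way around this is to wrap the received exception itself, so that its status text survives. The sketch below uses only the JDK; SketchServiceException is a hypothetical stand-in for the article's ServiceException (the real class lives in the commons module and its constructors may differ), and a plain RuntimeException plays the role of the received StatusRuntimeException.

```java
import java.util.Objects;

// Hypothetical stand-in for ServiceException; the real class is in the commons module.
class SketchServiceException extends RuntimeException {
  SketchServiceException(String message, Throwable cause) {
    super(message, cause);
  }
}

final class CauseFallback {
  private CauseFallback() {}

  // Wraps the transport error itself instead of its (null) cause, so the
  // status text and the stack trace are both retained.
  static SketchServiceException wrap(RuntimeException transportError) {
    Objects.requireNonNull(transportError, "transportError");
    return new SketchServiceException(transportError.getMessage(), transportError);
  }

  public static void main(String[] args) {
    // Simulates what the client receives: a message, but no cause.
    RuntimeException received = new RuntimeException("NOT_FOUND: Product ID not found");
    SketchServiceException wrapped = wrap(received);
    System.out.println(wrapped.getMessage());
    System.out.println(wrapped.getCause() == received);
  }
}
```

In the gateway, the same idea would amount to passing error rather than error.getCause() to the domain exception.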
Passing error metadata using gRPC Metadata
But what if you need to pass some error metadata back to the client? For example, in our sample application, we may want to pass the ID of the Product and a standard error message when an error occurs. This can be done using gRPC Metadata. First, attach the metadata to the exception in the repository:
public Product get(String productId) {
  var product = Optional.ofNullable(productStorage.get(productId));
  return product.orElseThrow(
      () ->
          new ResourceNotFoundException(
              "Product ID not found",
              Map.of("resource_id", productId, "message", "Product ID not found")));
}
Fortunately, the ResourceNotFoundException class has an overloaded constructor that takes additional error metadata: ResourceNotFoundException(String message, Map<String, String> errorMetaData).
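The exception class itself is not shown in this article, so here is a minimal sketch of what it might look like. The class name, the two constructors and the getErrorMetaData() accessor follow the usage in the snippets; everything else (the immutable copy, the empty default) is an assumption and may differ from the class in the commons module.

```java
import java.util.Map;

// Hypothetical sketch of the exception from the commons module.
class ResourceNotFoundException extends RuntimeException {
  private final Map<String, String> errorMetaData;

  ResourceNotFoundException(String message) {
    // No metadata supplied: fall back to an empty map rather than null.
    this(message, Map.of());
  }

  ResourceNotFoundException(String message, Map<String, String> errorMetaData) {
    super(message);
    // Immutable copy, so the metadata cannot change after the exception is thrown.
    this.errorMetaData = Map.copyOf(errorMetaData);
  }

  Map<String, String> getErrorMetaData() {
    return errorMetaData;
  }
}
```

Keeping the metadata as a plain Map<String, String> means the exception stays free of gRPC types, in line with the separation of concerns discussed earlier.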
We can change the Product Service API by catching ResourceNotFoundException and calling responseObserver.onError(statusRuntimeException) with the additional metadata:
public void getProduct(
    GetProductRequest request, StreamObserver<GetProductResponse> responseObserver) {
  String productId = request.getProductId();
  try {
    var product = productRepository.get(productId);
    var response =
        GetProductResponse.newBuilder()
            .setName(product.getName())
            .setDescription(product.getDescription())
            .setPrice(product.getPrice())
            .setUserId(product.getUserId())
            .build();
    responseObserver.onNext(response);
    responseObserver.onCompleted();
  } catch (ResourceNotFoundException error) {
    log.error("Product id, {} not found", productId);
    var errorMetaData = error.getErrorMetaData();
    var metadata = new Metadata();
    errorMetaData.entrySet().stream()
        .forEach(
            entry ->
                metadata.put(
                    Metadata.Key.of(entry.getKey(), Metadata.ASCII_STRING_MARSHALLER),
                    entry.getValue()));
    var statusRuntimeException =
        Status.NOT_FOUND.withDescription(error.getMessage()).asRuntimeException(metadata);
    responseObserver.onError(statusRuntimeException);
  }
  log.info("Finished calling Product API service..");
}
Let's understand what's being done here.
- Get the error metadata from our custom ResourceNotFoundException by calling error.getErrorMetaData().
- For each key-value pair of error metadata, create a key with Metadata.Key.of(entry.getKey(), Metadata.ASCII_STRING_MARSHALLER).
- Store the key-value pairs in metadata by calling metadata.put(key, value).
- Create a StatusRuntimeException by passing the metadata to Status.asRuntimeException(metadata).
- Call responseObserver.onError(..) to signal the error condition.
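One thing to watch in the key-creation step: the map keys go straight into Metadata.Key.of(..). The grpc-java documentation restricts key names to digits, lowercase letters (uppercase is normalized to lowercase) and the characters -, _ and ., and reserves the -bin suffix for binary values, so keys coming from domain code may need cleaning first. Below is a minimal JDK-only sketch of such a cleanup; MetadataKeySanitizer is a hypothetical helper, not part of the sample project, and the replacement rules are just one possible choice.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: turns arbitrary domain keys into names that are
// acceptable as ASCII metadata keys.
final class MetadataKeySanitizer {
  private MetadataKeySanitizer() {}

  // Lowercases the key and replaces every character outside [a-z0-9_.-] with '_'.
  static String sanitize(String key) {
    String lower = key.toLowerCase(Locale.ROOT);
    String cleaned = lower.replaceAll("[^a-z0-9_.-]", "_");
    // "-bin" marks binary-valued keys, so avoid that suffix for ASCII values.
    return cleaned.endsWith("-bin") ? cleaned + "_" : cleaned;
  }

  // Applies sanitize(..) to every key; if two keys collapse to the same name,
  // the later entry wins.
  static Map<String, String> sanitizeKeys(Map<String, String> errorMetaData) {
    Map<String, String> result = new LinkedHashMap<>();
    errorMetaData.forEach((key, value) -> result.put(sanitize(key), value));
    return result;
  }

  public static void main(String[] args) {
    System.out.println(sanitize("Resource Id"));
    System.out.println(sanitize("payload-bin"));
  }
}
```

The sanitized map can then be fed to the loop shown in the catch block above.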
On the client side, you can catch StatusRuntimeException and read the Metadata from the error:
} catch (StatusRuntimeException error) {
  Metadata trailers = error.getTrailers();
  Set<String> keys = trailers.keys();
  for (String key : keys) {
    Metadata.Key<String> k = Metadata.Key.of(key, Metadata.ASCII_STRING_MARSHALLER);
    log.info("Received key {}, with value {}", k, trailers.get(k));
  }
}
In case of an error, the code above prints:
Received key Key{name='resource_id'}, with value 32c29935-da42-4801-825a-ac410584c281
Received key Key{name='content-type'}, with value application/grpc
Received key Key{name='message'}, with value Product ID not found
As you can see, it's not clear which metadata is error-related, since metadata can also contain other information such as content-type (or trace information). You can, of course, define your own convention (for example, prefixing all error metadata keys with err_).
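To make the prefix convention concrete, here is a small JDK-only sketch that separates error entries from everything else. It works on a plain Map<String, String>, assuming the trailers have already been copied into one; ErrorMetadataFilter and the err_ prefix are illustrative names, not part of the sample project.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for an "err_" key-prefix convention.
final class ErrorMetadataFilter {
  static final String ERROR_PREFIX = "err_";

  private ErrorMetadataFilter() {}

  // Keeps only the entries whose key starts with the prefix and strips the
  // prefix from the returned keys.
  static Map<String, String> extractErrorEntries(Map<String, String> trailers) {
    Map<String, String> errors = new TreeMap<>();
    trailers.forEach(
        (key, value) -> {
          if (key.startsWith(ERROR_PREFIX)) {
            errors.put(key.substring(ERROR_PREFIX.length()), value);
          }
        });
    return errors;
  }

  public static void main(String[] args) {
    Map<String, String> trailers =
        Map.of(
            "content-type", "application/grpc",
            "err_resource_id", "32c29935-da42-4801-825a-ac410584c281",
            "err_message", "Product ID not found");
    // Only resource_id and message remain; content-type is dropped.
    System.out.println(extractErrorEntries(trailers));
  }
}
```

The server would have to apply the same prefix when it fills the metadata, which is exactly the kind of hand-rolled contract that a structured error model avoids.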
There is another, cleaner way to propagate error metadata.
Google Richer Error Model
Google's google.rpc.Status provides much richer error handling capabilities. This approach is used by Google APIs, but it is not yet part of the official gRPC error model. Internally, it still uses metadata, but in a cleaner way. google.rpc.Status is defined as:
package google.rpc;

// The `Status` type defines a logical error model that is suitable for
// different programming environments, including REST APIs and RPC APIs.
message Status {
  // A simple error code that can be easily handled by the client. The
  // actual error code is defined by `google.rpc.Code`.
  int32 code = 1;
  // A developer-facing human-readable error message in English. It should
  // both explain the error and offer an actionable resolution to it.
  string message = 2;
  // Additional error information that the client code can use to handle
  // the error, such as retry info or a help link.
  repeated google.protobuf.Any details = 3;
}
Be aware of the main gotcha with this approach: it is not supported by all language libraries, and the implementation may not be consistent across languages.
The richness of error handling comes from the repeated google.protobuf.Any field. From the documentation:
`Any` contains an arbitrary serialized protocol buffer message along with a URL that describes the type of the serialized message.
You can use Any to pack your own custom error model or any of the predefined models from error_details.proto. Let's look at both approaches.
Using custom error model
Define your own custom error model as:
message ErrorDetail {
  // Error code
  string errorCode = 1;
  // Error message
  string message = 2;
  // Additional metadata associated with the Error
  map<string, string> metadata = 3;
}
On the server side, in Product Service, build the ErrorDetail model and add it to com.google.rpc.Status by calling .addDetails(Any.pack(errorInfo)):
//Catch Block
} catch (ResourceNotFoundException error) {
  log.error("Product id, {} not found", productId);
  var errorMetaData = error.getErrorMetaData();
  Resources.ErrorDetail errorInfo =
      Resources.ErrorDetail.newBuilder()
          .setErrorCode("ResourceNotFound")
          .setMessage(error.getMessage())
          .putAllMetadata(errorMetaData)
          .build();
  com.google.rpc.Status status =
      com.google.rpc.Status.newBuilder()
          .setCode(Code.NOT_FOUND.getNumber())
          .setMessage("Product id not found")
          .addDetails(Any.pack(errorInfo))
          .build();
  responseObserver.onError(StatusProto.toStatusRuntimeException(status));
}
On the client side, in Product Gateway Service, change the catch block to:
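//Catch Block
} catch (StatusRuntimeException error) {
  // fromThrowable(..) returns null when the error carries no google.rpc.Status
  com.google.rpc.Status status = io.grpc.protobuf.StatusProto.fromThrowable(error);
  Resources.ErrorDetail errorInfo = null;
  if (status != null) {
    for (Any any : status.getDetailsList()) {
      if (!any.is(Resources.ErrorDetail.class)) {
        continue;
      }
      // unpack(..) throws the checked InvalidProtocolBufferException
      errorInfo = any.unpack(Resources.ErrorDetail.class);
    }
  }
  if (errorInfo == null) {
    // No structured detail received: fall back to the plain status message
    throw new ServiceException(error.getMessage(), Map.of());
  }
  log.info(" Error while calling product service, reason {} ", errorInfo.getMessage());
  throw new ServiceException(errorInfo.getMessage(), errorInfo.getMetadataMap());
}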
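A status may arrive without any detail of the expected type, for example when the error was raised by infrastructure rather than by Product Service, so the lookup has to cope with the empty case. The sketch below shows that lookup with plain Java objects standing in for the packed Any messages: type::isInstance plays the role of any.is(..) and type::cast the role of any.unpack(..). DetailLookup is a hypothetical helper; unlike the loop above, it returns the first match rather than the last.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper illustrating a null-free "find the detail I understand" lookup.
final class DetailLookup {
  private DetailLookup() {}

  // Returns the first element of the requested type, or Optional.empty()
  // when the list holds no such element.
  static <T> Optional<T> firstOfType(List<?> details, Class<T> type) {
    return details.stream().filter(type::isInstance).map(type::cast).findFirst();
  }

  public static void main(String[] args) {
    List<Object> details = List.of(404, "Product ID not found");
    // A matching detail is present.
    System.out.println(firstOfType(details, String.class).orElse("no detail"));
    // No Double in the list, so the fallback is used instead of a null dereference.
    System.out.println(firstOfType(details, Double.class).map(String::valueOf).orElse("no detail"));
  }
}
```

Returning an Optional forces the caller to decide what to do when the server sent no structured detail, instead of discovering it through a NullPointerException.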
Using pre-defined error model
Rather than defining your own error model, you can use one of the predefined error models from error_details.proto. For example, you can use ErrorInfo, defined as:
message ErrorInfo {
  // The reason of the error. This is a constant value that identifies the
  // proximate cause of the error. Error reasons are unique within a particular
  // domain of errors. This should be at most 63 characters and match
  // /[A-Z0-9_]+/.
  string reason = 1;
  // The logical grouping to which the "reason" belongs. The error domain
  // is typically the registered service name of the tool or product that
  // generates the error. Example: "pubsub.googleapis.com". If the error is
  // generated by some common infrastructure, the error domain must be a
  // globally unique value that identifies the infrastructure. For Google API
  // infrastructure, the error domain is "googleapis.com".
  string domain = 2;
  // Additional structured details about this error.
  // Keys should match /[a-zA-Z0-9-_]/ and be limited to 64 characters in
  // length. When identifying the current value of an exceeded limit, the units
  // should be contained in the key, not the value. For example, rather than
  // {"instanceLimit": "100/request"}, should be returned as,
  // {"instanceLimitPerRequest": "100"}, if the client exceeds the number of
  // instances that can be created in a single (batch) request.
  map<string, string> metadata = 3;
}
On the server side, in Product Service, you can use com.google.rpc.ErrorInfo as follows:
} catch (ResourceNotFoundException error) {
  var errorMetaData = error.getErrorMetaData();
  ErrorInfo errorInfo =
      ErrorInfo.newBuilder()
          .setReason("Resource not found")
          .setDomain("Product")
          .putAllMetadata(errorMetaData)
          .build();
  com.google.rpc.Status status =
      com.google.rpc.Status.newBuilder()
          .setCode(Code.NOT_FOUND.getNumber())
          .setMessage("Product id not found")
          .addDetails(Any.pack(errorInfo))
          .build();
  responseObserver.onError(StatusProto.toStatusRuntimeException(status));
}
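Note that the ErrorInfo comments quoted earlier ask for reason to be a constant of at most 63 characters matching /[A-Z0-9_]+/, whereas the snippet above passes free text. If you want to follow that recommendation, a small helper can validate or derive a conforming value. The sketch below uses only the JDK; ErrorReasons is a hypothetical helper, and the conversion rule (uppercase, non-alphanumeric runs become a single underscore) is an assumption, not something the proto prescribes.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper for ErrorInfo.reason values.
final class ErrorReasons {
  // Pattern and length limit taken from the ErrorInfo.reason comment.
  private static final Pattern VALID_REASON = Pattern.compile("[A-Z0-9_]+");
  private static final int MAX_LENGTH = 63;

  private ErrorReasons() {}

  // True when the value already satisfies the documented format.
  static boolean isValid(String reason) {
    return reason.length() <= MAX_LENGTH && VALID_REASON.matcher(reason).matches();
  }

  // Converts free text into UPPER_SNAKE_CASE and truncates it to the limit.
  static String toReason(String text) {
    String upper = text.trim().toUpperCase(Locale.ROOT);
    String snake = upper.replaceAll("[^A-Z0-9]+", "_").replaceAll("^_+|_+$", "");
    return snake.length() > MAX_LENGTH ? snake.substring(0, MAX_LENGTH) : snake;
  }

  public static void main(String[] args) {
    System.out.println(isValid("Resource not found"));
    System.out.println(toReason("Resource not found"));
  }
}
```

A constant such as RESOURCE_NOT_FOUND is also easier for clients to switch on than a sentence that may be reworded later.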
The only change on the client side is to use the compiled ErrorInfo class:
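//Catch Block
} catch (StatusRuntimeException error) {
  // fromThrowable(..) returns null when the error carries no google.rpc.Status
  com.google.rpc.Status status = io.grpc.protobuf.StatusProto.fromThrowable(error);
  ErrorInfo errorInfo = null;
  if (status != null) {
    for (Any any : status.getDetailsList()) {
      if (!any.is(ErrorInfo.class)) {
        continue;
      }
      // unpack(..) throws the checked InvalidProtocolBufferException
      errorInfo = any.unpack(ErrorInfo.class);
    }
  }
  if (errorInfo == null) {
    // No structured detail received: fall back to the plain status message
    throw new ServiceException(error.getMessage(), Map.of());
  }
  log.info(" Error while calling product service, reason {} ", errorInfo.getReason());
  throw new ServiceException(errorInfo.getReason(), errorInfo.getMetadataMap());
}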
Global Interceptor for error handling
Catching and translating exceptions in every server-side method of Product Service can quickly become complex and clumsy. With complex business logic, you may end up with code like catch (ResourceNotFoundException | ServiceException | OtherException error).
We can simplify this by using a gRPC interceptor. The interceptor catches such exceptions and processes them accordingly:
public class GlobalExceptionHandlerInterceptor implements ServerInterceptor {
  @Override
  public <T, R> ServerCall.Listener<T> interceptCall(
      ServerCall<T, R> serverCall, Metadata headers, ServerCallHandler<T, R> serverCallHandler) {
    ServerCall.Listener<T> delegate = serverCallHandler.startCall(serverCall, headers);
    return new ExceptionHandler<>(delegate, serverCall, headers);
  }

  private static class ExceptionHandler<T, R>
      extends ForwardingServerCallListener.SimpleForwardingServerCallListener<T> {
    private final ServerCall<T, R> delegate;
    private final Metadata headers;

    ExceptionHandler(
        ServerCall.Listener<T> listener, ServerCall<T, R> serverCall, Metadata headers) {
      super(listener);
      this.delegate = serverCall;
      this.headers = headers;
    }

    @Override
    public void onHalfClose() {
      try {
        super.onHalfClose();
      } catch (RuntimeException ex) {
        handleException(ex, delegate, headers);
        throw ex;
      }
    }

    private void handleException(
        RuntimeException exception, ServerCall<T, R> serverCall, Metadata headers) {
      // Catch specific Exception and Process
      if (exception instanceof ResourceNotFoundException) {
        var errorMetaData = ((ResourceNotFoundException) exception).getErrorMetaData();
        // Build google.rpc.ErrorInfo
        var errorInfo =
            ErrorInfo.newBuilder()
                .setReason("Resource not found")
                .setDomain("Product")
                .putAllMetadata(errorMetaData)
                .build();
        com.google.rpc.Status rpcStatus =
            com.google.rpc.Status.newBuilder()
                .setCode(Code.NOT_FOUND.getNumber())
                .setMessage("Product id not found")
                .addDetails(Any.pack(errorInfo))
                .build();
        var statusRuntimeException = StatusProto.toStatusRuntimeException(rpcStatus);
        var newStatus = Status.fromThrowable(statusRuntimeException);
        // Get metadata from statusRuntimeException
        Metadata newHeaders = statusRuntimeException.getTrailers();
        serverCall.close(newStatus, newHeaders);
      } else {
        serverCall.close(Status.UNKNOWN, headers);
      }
    }
  }
}
Let's understand what's being done here:
- First, create ExceptionHandler, which overrides onHalfClose(), by extending ForwardingServerCallListener.SimpleForwardingServerCallListener<T>.
- The handleException(..) method first builds google.rpc.ErrorInfo and then adds ErrorInfo to com.google.rpc.Status; StatusProto.toStatusRuntimeException(..) then builds the new metadata containing ErrorInfo.
- Since serverCall.close(status, newHeaders) takes an io.grpc.Status, we need to convert com.google.rpc.Status by calling Status.fromThrowable(statusRuntimeException).
- Then all we need to do is call serverCall.close(status, newHeaders) with the io.grpc.Status and the new metadata.
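As more exception types are added, the instanceof chain in handleException(..) grows with them. One alternative is a small lookup table from exception type to a mapping function. The sketch below shows the idea with JDK types only: ExceptionMappers is a hypothetical class, the mappers return a status name as a plain String, and JDK exceptions stand in for the domain exceptions. In the interceptor, the mapping functions would build the com.google.rpc.Status instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical registry that replaces an instanceof chain with a lookup.
final class ExceptionMappers {
  // Insertion order is kept, so more specific types should be registered first.
  private final Map<Class<? extends RuntimeException>, Function<RuntimeException, String>>
      mappers = new LinkedHashMap<>();

  <E extends RuntimeException> ExceptionMappers register(
      Class<E> type, Function<E, String> mapper) {
    // The cast is safe because map(..) only invokes a mapper after isInstance(..) matched.
    mappers.put(type, ex -> mapper.apply(type.cast(ex)));
    return this;
  }

  // Applies the first registered mapper whose type matches the exception
  // (subclasses match too); empty when nothing is registered for it.
  Optional<String> map(RuntimeException exception) {
    return mappers.entrySet().stream()
        .filter(entry -> entry.getKey().isInstance(exception))
        .map(entry -> entry.getValue().apply(exception))
        .findFirst();
  }

  public static void main(String[] args) {
    ExceptionMappers mappers =
        new ExceptionMappers()
            .register(IllegalArgumentException.class, ex -> "INVALID_ARGUMENT")
            .register(IllegalStateException.class, ex -> "FAILED_PRECONDITION");
    // Unregistered types fall back to UNKNOWN, like the else branch of the interceptor.
    System.out.println(mappers.map(new IllegalStateException("boom")).orElse("UNKNOWN"));
    System.out.println(mappers.map(new UnsupportedOperationException()).orElse("UNKNOWN"));
  }
}
```

This is essentially what annotation-driven handlers do for you: a table of exception types and the code that translates each one.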
The only change needed in the server-side implementation of the Product Service API is to remove the catch block and the exception-processing logic:
public void getProduct(
    GetProductRequest request, StreamObserver<GetProductResponse> responseObserver) {
  String productId = request.getProductId();
  var product = productRepository.get(productId);
  var response =
      GetProductResponse.newBuilder()
          .setName(product.getName())
          .setDescription(product.getDescription())
          .setPrice(product.getPrice())
          .setUserId(product.getUserId())
          .build();
  responseObserver.onNext(response);
  responseObserver.onCompleted();
}
On the client side, nothing changes: we can still get an instance of the ErrorInfo class with errorInfo = any.unpack(ErrorInfo.class).
Using Spring Interceptor
If you can use grpc-spring-boot-starter, error handling becomes much simpler. All you need to do is create a class, annotate it with @GrpcAdvice, and provide methods to handle the individual exceptions:
@GrpcAdvice
public class ExceptionHandler {
  @GrpcExceptionHandler(ResourceNotFoundException.class)
  public StatusRuntimeException handleResourceNotFoundException(ResourceNotFoundException cause) {
    var errorMetaData = cause.getErrorMetaData();
    var errorInfo =
        ErrorInfo.newBuilder()
            .setReason("Resource not found")
            .setDomain("Product")
            .putAllMetadata(errorMetaData)
            .build();
    var status =
        com.google.rpc.Status.newBuilder()
            .setCode(Code.NOT_FOUND.getNumber())
            .setMessage("Resource not found")
            .addDetails(Any.pack(errorInfo))
            .build();
    return StatusProto.toStatusRuntimeException(status);
  }
}
This approach is similar to Spring's error handling. You just need to define a method annotated with @GrpcExceptionHandler, for example @GrpcExceptionHandler(ResourceNotFoundException.class), for each specific error condition. That's it; no other change is needed on the server side.
Summary
Getting error handling right can be very tricky in gRPC. Officially, gRPC relies heavily on status codes and metadata to handle errors. We can use gRPC metadata to pass additional error metadata from the server application to the client application. Google's google.rpc.Status provides much richer error handling capabilities, but it is not fully supported in all languages. It is possible to define a global gRPC interceptor to handle all error conditions centrally. The Spring Boot wrapper library yidongnan/grpc-spring-boot-starter provides a much cleaner approach to handling errors.