Length: Two hours
Registration fee: $ (plus tax where applicable)
Language: English
Exam format: 50-60 multiple choice and multiple select questions
Exam delivery method:
a. Take the online-proctored exam from a remote location. Review the online
testing requirements first.
b. Take the onsite-proctored exam at a testing center. Locate a test center near
you.
Prerequisites: None
Recommended experience: 3+ years of industry experience including 1 or more
years designing and managing solutions using Google Cloud.
Certification Renewal / Recertification: Candidates must recertify in order to
maintain their certification status. Unless explicitly stated in the detailed
exam descriptions, all Google Cloud certifications are valid for two years from
the date of certification. Recertification is accomplished by retaking the exam
during the recertification eligibility time period and achieving a passing
score. You may attempt recertification starting 60 days prior to your
certification expiration date.
Exam overview
Step 1: Get real world experience
Before attempting the Machine Learning Engineer exam, it's recommended that you
have 3+ years of hands-on experience with Google Cloud products and solutions.
Ready to start building? Explore the Google Cloud Free Tier for free usage (up
to monthly limits) of select products.
Step 2: Understand what's on the exam
The exam guide contains a complete list of topics that may be included on
the exam. Review the exam guide to determine if your skills align with the
topics on the exam.
Step 3: Review the sample questions
Familiarize yourself with the format of questions and example content that
may be covered on the Machine Learning Engineer exam.
Step 4: Round out your skills with training
Prepare for the exam by following the Machine Learning Engineer learning path.
Explore online training, in-person classes, hands-on labs, and other resources
from Google Cloud.
Prepare for the exam with Googlers and certified experts. Get valuable exam tips
and tricks, as well as insights from industry experts.
Explore Google Cloud documentation for in-depth discussions on the concepts and
critical components of Google Cloud.
Learn about designing, training, building, deploying, and operationalizing
secure ML applications on Google Cloud using the Official Google Cloud Certified
Professional Machine Learning Engineer Study Guide. This guide uses real-world
scenarios to demonstrate how to use the Vertex AI platform and technologies such
as TensorFlow, Kubeflow, and AutoML, as well as best practices on when to choose
a pretrained or a custom model.
Step 5: Schedule an exam
Register and select the option to take the exam remotely or at a nearby testing
center.
Review exam terms and conditions and data sharing policies.
A Professional Machine Learning Engineer builds, evaluates, productionizes, and
optimizes ML models by using Google Cloud technologies and knowledge of proven
models and techniques. The ML Engineer handles large, complex datasets and
creates repeatable, reusable code. The ML Engineer considers responsible AI and
fairness throughout the ML model development process, and collaborates closely
with other job roles to ensure long-term success of ML-based applications. The
ML Engineer has strong programming skills and experience with data platforms and
distributed data processing tools. The ML Engineer is proficient in the areas of
model architecture, data and ML pipeline creation, and metrics interpretation.
The ML Engineer is familiar with foundational concepts of MLOps, application
development, infrastructure management, data engineering, and data governance.
The ML Engineer makes ML accessible and enables teams across the organization.
By training, retraining, deploying, scheduling, monitoring, and improving
models, the ML Engineer designs and creates scalable, performant solutions.
* Note: The exam does not directly assess coding skill. If you have a minimum
proficiency in Python and Cloud SQL, you should be able to interpret any
questions with code snippets.
The Professional Machine Learning Engineer exam assesses your ability to:
Architect low-code ML solutions
Collaborate within and across teams to manage data and models
Scale prototypes into ML models
Serve and scale models
Automate and orchestrate ML pipelines
Monitor ML solutions
QUESTION 1
As the lead ML Engineer for your company, you are responsible for building ML
models to digitize scanned customer forms. You have developed a TensorFlow model
that converts the scanned images into text and stores them in Cloud Storage. You
need to use your ML model on the aggregated data collected at the end of each
day with minimal manual intervention. What should you do?
A. Use the batch prediction functionality of AI Platform.
B. Create a serving pipeline in Compute Engine for prediction.
C. Use Cloud Functions for prediction each time a new data point is ingested.
D. Deploy the model on AI Platform and create a version of it for online
inference.
Answer: A
Explanation:
Batch prediction is the process of using an ML model to make predictions on a
large set of data points. It suits scenarios where the predictions are not
time-sensitive and can be done in batches, such as digitizing scanned customer
forms at the end of each day. Batch prediction can also handle large volumes of
data and scale resources up or down as needed. AI Platform provides a batch
prediction service that lets users submit a job with their TensorFlow model and
input data stored in Cloud Storage, and receive the output predictions in Cloud
Storage as well. This service requires minimal manual intervention and can be
automated with Cloud Scheduler or Cloud Functions. Therefore, using the batch
prediction functionality of AI Platform is the best option for this use case.
Reference:
Batch prediction overview
Using batch prediction
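As a rough illustration of the daily job described above, the sketch below only
builds the request body for a batch prediction job; it does not submit it. The
field names (jobId, predictionInput, dataFormat, inputPaths, outputPath, region,
modelName) follow the AI Platform Training and Prediction REST API as best
understood here. The bucket, project, model name, and path layout are all
hypothetical.

```python
from datetime import date


def build_batch_prediction_job(day, bucket, model_name, region):
    """Build a request body for one day's batch prediction job.

    The bucket layout and job naming scheme are assumptions made for this
    example, not something the exam question specifies.
    """
    stamp = day.strftime("%Y%m%d")
    return {
        # Job IDs must be unique per project, so the date is baked in.
        "jobId": f"digitize_forms_{stamp}",
        "predictionInput": {
            "dataFormat": "JSON",
            # Read every file collected for that day from Cloud Storage.
            "inputPaths": [f"gs://{bucket}/forms/{day.isoformat()}/*"],
            # Predictions are written back to Cloud Storage.
            "outputPath": f"gs://{bucket}/predictions/{day.isoformat()}/",
            "region": region,
            "modelName": model_name,
        },
    }


job = build_batch_prediction_job(
    day=date(2024, 1, 1),
    bucket="example-bucket",
    model_name="projects/example-project/models/form_ocr",
    region="us-central1",
)
print(job["jobId"])
print(job["predictionInput"]["inputPaths"])
```

A scheduler (for example Cloud Scheduler, as the explanation suggests) could
call a function like this once per day and hand the result to whichever client
submits the job.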
QUESTION 2
You work for a global footwear retailer and need to predict when an item will be
out of stock based on historical inventory data. Customer behavior is highly
dynamic since footwear demand is influenced by many different factors. You want
to serve models that are trained on all available data, but track your
performance on specific subsets of data before pushing to production. What is
the most streamlined and reliable way to perform this validation?
A. Use the TFX ModelValidator tools to specify performance metrics for
production readiness.
B. Use k-fold cross-validation as a validation strategy to ensure that your
model is ready for production.
C. Use the last relevant week of data as a validation set to ensure that your
model is performing accurately on current data.
D. Use the entire dataset and treat the area under the receiver operating
characteristics curve (AUC ROC) as the main metric.
Answer: A
Explanation:
TFX ModelValidator is a tool that lets you compare new models against a baseline
model and evaluate their performance on different metrics and data slices [1].
You can use it to validate your models before deploying them to production and
ensure that they meet your expectations and requirements.
k-fold cross-validation is a technique that splits the data into k subsets and
trains the model on k-1 subsets while testing it on the remaining subset. This
is repeated k times and the average performance is reported [2]. The technique
is useful for estimating the generalization error of a model, but it does not
account for the dynamic nature of customer behavior or potential changes in the
data distribution over time.
Using the last relevant week of data as a validation set is a simple way to
check the model's performance on recent data, but that week may not be
representative of the entire dataset or capture long-term trends and patterns.
It also does not let you compare the model with a baseline or evaluate it on
different data slices.
Using the entire dataset and treating AUC ROC as the main metric is not good
practice because it leaves no data for validation or testing. It also assumes
that AUC ROC is the only metric that matters, which may not be true for your
business problem. You may want to consider other metrics such as precision,
recall, or revenue.
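The slice-based comparison against a baseline can be sketched in plain Python.
This is not the TFX API; it only illustrates the idea of approving a candidate
model when every data slice clears a threshold and does not regress against the
baseline. The data, the two toy models, and the 0.7 threshold are hypothetical.

```python
from collections import defaultdict


def accuracy_by_slice(examples, predict):
    """Return {slice_name: accuracy} for (features, label, slice_name) rows."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for features, label, slice_name in examples:
        total[slice_name] += 1
        if predict(features) == label:
            correct[slice_name] += 1
    return {name: correct[name] / total[name] for name in total}


def is_blessed(candidate_scores, baseline_scores, min_accuracy=0.7):
    """Approve only if every slice clears the floor and matches the baseline."""
    return all(
        score >= min_accuracy and score >= baseline_scores.get(name, 0.0)
        for name, score in candidate_scores.items()
    )


# Hypothetical rows: (weeks_of_stock_left, went_out_of_stock, region).
examples = [
    (1, 1, "emea"), (5, 0, "emea"), (2, 1, "emea"),
    (1, 1, "apac"), (3, 1, "apac"), (6, 0, "apac"),
]


def baseline_model(weeks_left):
    return int(weeks_left <= 1)


def candidate_model(weeks_left):
    return int(weeks_left <= 3)


baseline_scores = accuracy_by_slice(examples, baseline_model)
candidate_scores = accuracy_by_slice(examples, candidate_model)
print(candidate_scores)
print(is_blessed(candidate_scores, baseline_scores))
```

Checking each slice separately is what distinguishes this from a single
aggregate metric: a model that improves overall but regresses in one region
would not be approved.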
QUESTION 3
You work on a growing team of more than 50 data scientists who all use AI
Platform. You are designing a strategy to organize your jobs, models, and
versions in a clean and scalable way. Which strategy should you choose?
A. Set up restrictive IAM permissions on the AI Platform notebooks so that only
a single user or group can access a given instance.
B. Separate each data scientist's work into a different project to ensure that
the jobs, models, and versions created by each data scientist are accessible
only to that user.
C. Use labels to organize resources into descriptive categories. Apply a label
to each created resource so that users can filter the results by label when
viewing or monitoring the resources.
D. Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered
to capture information about AI Platform resource usage. In BigQuery, create a
SQL view that maps users to the resources they are using.
Answer: C
Explanation:
Labels are key-value pairs that can be attached to any AI Platform resource,
such as jobs, models, versions, or endpoints [1]. Labels help you organize your
resources into descriptive categories, such as project, team, environment, or
purpose. You can use labels to filter the results when you list or monitor your
resources, or to group them for billing or quota purposes [2]. Using labels is a
simple and scalable way to manage your AI Platform resources without creating
unnecessary complexity or overhead. Therefore, using labels to organize
resources is the best strategy for this use case.
Reference:
Using labels
Filtering and grouping by labels
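To make the labeling idea concrete, here is a small in-memory sketch of
filtering resources by label. It does not call any Google Cloud API; the
resource names and the label keys (team, env) are hypothetical.

```python
# Hypothetical jobs and models, each carrying key-value labels.
resources = [
    {"name": "train_resnet_001", "labels": {"team": "vision", "env": "dev"}},
    {"name": "train_resnet_002", "labels": {"team": "vision", "env": "prod"}},
    {"name": "train_bert_001", "labels": {"team": "nlp", "env": "dev"}},
]


def filter_by_labels(items, **wanted):
    """Return the items whose labels contain every requested key-value pair."""
    return [
        item
        for item in items
        if all(item["labels"].get(key) == value for key, value in wanted.items())
    ]


vision_prod = filter_by_labels(resources, team="vision", env="prod")
dev_names = [item["name"] for item in filter_by_labels(resources, env="dev")]
print([item["name"] for item in vision_prod])
print(dev_names)
```

With a consistent labeling scheme, one shared project can serve all 50 data
scientists while each team still sees only the slice of resources it asks for.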
QUESTION 4
During batch training of a neural network, you notice that there is an
oscillation in the loss. How should you adjust your model to ensure that it
converges?
A. Increase the size of the training batch
B. Decrease the size of the training batch
C. Increase the learning rate hyperparameter
D. Decrease the learning rate hyperparameter
Answer: D
Explanation:
Oscillation in the loss during batch training of a neural network means that the
model is overshooting the optimal point of the loss function and bouncing back
and forth. This can prevent the model from converging to the minimum loss value.
One of the main causes is a learning rate hyperparameter, which controls the
size of the steps the model takes along the gradient, that is too high.
Decreasing the learning rate therefore helps the model take smaller, more
precise steps and avoid oscillation. This is a common technique for improving
the stability and performance of neural network training [1][2].
Reference:
Interpreting Loss Curves
Is learning rate the only reason for training loss oscillation after few epochs?
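The overshooting behavior can be reproduced on a toy one-dimensional loss. The
sketch below uses a hypothetical asymmetric quadratic (steeper on the negative
side) purely for illustration; it is not a neural network. With a large learning
rate the parameter jumps across the minimum on every step and the loss
alternates between two values; with a smaller rate the loss decreases steadily.

```python
def loss(x):
    """Toy loss with its minimum at x = 0, steeper for negative x."""
    return 1.5 * x * x if x >= 0 else 3.0 * x * x


def gradient(x):
    """Derivative of the toy loss."""
    return 3.0 * x if x >= 0 else 6.0 * x


def run_gradient_descent(learning_rate, start=1.0, steps=6):
    """Return the loss observed before each update step."""
    x = start
    losses = []
    for _ in range(steps):
        losses.append(loss(x))
        x -= learning_rate * gradient(x)
    return losses


high_lr_losses = run_gradient_descent(learning_rate=0.5)
low_lr_losses = run_gradient_descent(learning_rate=0.1)
print(high_lr_losses)  # [1.5, 0.75, 1.5, 0.75, 1.5, 0.75]
print(low_lr_losses)   # strictly decreasing
```

Changing only the learning rate, with the same loss and starting point, is
enough to move from oscillation to convergence.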
QUESTION 5
You are building a linear model with over 100 input features, all with values
between -1 and 1. You suspect that many features are non-informative. You want
to remove the non-informative features from your model while keeping the
informative ones in their original form. Which technique should you use?
A. Use Principal Component Analysis to eliminate the least informative features.
B. Use L1 regularization to reduce the coefficients of uninformative features to
0.
C. After building your model, use Shapley values to determine which features are
the most informative.
D. Use an iterative dropout technique to identify which features do not degrade
the model when removed.
Answer: B
Explanation:
L1 regularization, also known as Lasso regularization, adds the sum of the
absolute values of the model's coefficients to the loss function [1]. It
encourages sparsity in the model by shrinking some coefficients to exactly
zero [2]. In this way, L1 regularization performs feature selection: it removes
the non-informative features from the model while keeping the informative ones
in their original form. Therefore, using L1 regularization is the best technique
for this use case.
Reference:
Regularization in Machine Learning - GeeksforGeeks
Regularization in Machine Learning (with Code Examples) - Dataquest
L1 And L2 Regularization Explained & Practical How To Examples
L1 and L2 as Regularization for a Linear Model
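The sparsity effect can be seen in a small pure-Python sketch. It fits a linear
model with an L1 penalty using proximal gradient descent (ISTA), which is one
common way to solve the Lasso problem and is chosen here only for brevity. The
four-row dataset is hypothetical: the first feature drives the target and the
second contributes almost nothing.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(X, y, lam, step=0.5, iterations=200):
    """Minimize (1/2n) * ||Xw - y||^2 + lam * ||w||_1 with proximal steps."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        residuals = [
            sum(w[j] * X[i][j] for j in range(d)) - y[i] for i in range(n)
        ]
        grad = [
            sum(residuals[i] * X[i][j] for i in range(n)) / n for j in range(d)
        ]
        # Gradient step on the squared error, then the L1 proximal step.
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(d)]
    return w


# Hypothetical data with feature values in [-1, 1]: y = 2.0 * x1 + 0.1 * x2.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

w_plain = lasso_ista(X, y, lam=0.0)  # no penalty: ordinary least squares
w_lasso = lasso_ista(X, y, lam=0.5)  # L1 penalty zeroes the weak feature
print(w_plain)
print(w_lasso)
```

Without the penalty both coefficients stay nonzero. With it, the weak feature's
coefficient is driven to zero while the informative feature is kept (with some
shrinkage) in its original, untransformed form, unlike PCA, which would replace
the features with linear combinations.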
QUESTION 6
Your team has been tasked with creating an ML solution in Google Cloud to
classify support requests for one of your platforms. You analyzed the
requirements and decided to use TensorFlow to build the classifier so that you
have full control of the model's code, serving, and deployment. You will use
Kubeflow pipelines for the ML platform. To save time, you want to build on
existing resources and use managed services instead of building a completely new
model. How should you build the classifier?
A. Use the Natural Language API to classify support requests.
B. Use AutoML Natural Language to build the support requests classifier.
C. Use an established text classification model on AI Platform to perform
transfer learning.
D. Use an established text classification model on AI Platform as-is to classify
support requests.
Answer: C
Explanation:
Transfer learning is a technique that leverages the knowledge and weights of a
pre-trained model and adapts them to a new task or domain [1]. It saves time and
resources by avoiding training a model from scratch, and it can also improve the
model's performance and generalization because the pre-trained model has seen a
larger and more diverse dataset [2]. AI Platform provides several established
text classification models that can be used for transfer learning, such as BERT,
ALBERT, or XLNet [3]. These models are based on state-of-the-art natural
language processing techniques and can handle various text classification tasks,
such as sentiment analysis, topic classification, or spam detection [4]. By
using one of these models on AI Platform, you can customize the model's code,
serving, and deployment, and use Kubeflow pipelines for the ML platform.
Therefore, using an established text classification model on AI Platform to
perform transfer learning is the best option for this use case.
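The core mechanic of transfer learning, reusing a frozen pre-trained component
and training only a small task-specific head, can be sketched without any
framework. Everything below is a hypothetical stand-in: pretrained_encoder
imitates a frozen text encoder with fixed keyword counts, and the head is a tiny
logistic regression. A real solution would load an actual pre-trained
TensorFlow model instead.

```python
import math

BILLING_WORDS = {"refund", "invoice", "charge"}
TECHNICAL_WORDS = {"crash", "error", "bug"}


def pretrained_encoder(text):
    """Stand-in for a frozen pre-trained encoder; it is never updated."""
    tokens = text.lower().split()
    return [
        float(sum(token in BILLING_WORDS for token in tokens)),
        float(sum(token in TECHNICAL_WORDS for token in tokens)),
    ]


def train_head(features, labels, learning_rate=0.5, epochs=200):
    """Train only the new classification head with gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-score))
            for j, v in enumerate(x):
                grad_w[j] += (p - y) * v / n
            grad_b += (p - y) / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def classify(text, weights, bias):
    x = pretrained_encoder(text)
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return "technical" if score > 0 else "billing"


# Hypothetical labeled support requests: 0 = billing, 1 = technical.
training = [
    ("please refund this invoice", 0),
    ("wrong charge on my invoice", 0),
    ("app crash with error", 1),
    ("login bug causes crash", 1),
]
features = [pretrained_encoder(text) for text, _ in training]
labels = [label for _, label in training]
weights, bias = train_head(features, labels)

print(classify("error after crash", weights, bias))
print(classify("invoice refund please", weights, bias))
```

Only the head's weights change during training, which is why this approach
needs far less data and compute than training the whole model from scratch.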
Reference:
Transfer Learning - Machine Learning's Next Frontier