Final-term logistics

• Same day as lecture day

• Exam will only be available from 12:00am to 11:59pm on the exam day

• Once you start the exam, you must finish it in one attempt

• Format: Online

Final-term topics

• First-order Logic and Inference in First Order Logic

• Markov Decision Process

• Reinforcement Learning

• Learning: Supervised Learning, Linear Regression, Logistic Regression, Neural Networks

• Deep Learning: DNN, CNN, RNN, LSTM

Textbook

• Artificial Intelligence – A Modern Approach, Third Edition, Stuart J. Russell and Peter Norvig

Final-term topics

• First-order Logic + Inference in First Order Logic: Lecture Slides + Chapter 8 + Chapter 9

• Markov Decision Process: Lecture Slides + Chapter 17

• Reinforcement Learning: Lecture Slides + Chapter 21

• Learning: Lecture Slides + Chapter 18

• Deep Learning: Lecture Slides

Sample Question

• Suppose the discount factor γ = 0.5. Calculate the utility values for all grid positions using Value Iteration from time t = 0 to time t = 3
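The grid for this question is not reproduced here, so the following is only a minimal sketch of the Value Iteration update on a hypothetical three-cell corridor. The states, rewards, and deterministic moves are assumptions made for illustration; only γ = 0.5 and the range t = 0 to t = 3 come from the question.

```python
GAMMA = 0.5                              # discount factor from the question
STATES = [0, 1, 2]                       # hypothetical corridor, not the exam grid
TERMINAL = 2
REWARD = {0: 0.0, 1: 0.0, 2: 1.0}        # assumed rewards
ACTIONS = {"Left": -1, "Right": +1}


def transitions(state, action):
    """Return [(probability, next_state)]; moves are deterministic in this sketch."""
    nxt = min(max(state + ACTIONS[action], 0), len(STATES) - 1)
    return [(1.0, nxt)]


def value_iteration_step(U):
    """Apply U_{t+1}(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) * U_t(s')."""
    new_U = {}
    for s in STATES:
        if s == TERMINAL:
            new_U[s] = REWARD[s]         # a terminal state has no successors
            continue
        best = max(
            sum(p * U[s2] for p, s2 in transitions(s, a)) for a in ACTIONS
        )
        new_U[s] = REWARD[s] + GAMMA * best
    return new_U


def run(steps):
    """Return the utility tables for t = 0 .. steps, starting from all zeros."""
    U = {s: 0.0 for s in STATES}
    history = [U]
    for _ in range(steps):
        U = value_iteration_step(U)
        history.append(U)
    return history


history = run(3)
for t, U in enumerate(history):
    print(t, [U[s] for s in STATES])
# t = 3 prints: 3 [0.25, 0.5, 1.0]
```

For the exam grid, the same loop applies once `STATES`, `REWARD`, and `transitions` are replaced with the grid's own cells, rewards, and transition probabilities.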

Sample Question

Write the Bellman equation for state (1, 2)
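As a reminder, the general form of the Bellman equation in the textbook's notation (Chapter 17) is shown below. Answering the question means substituting s = (1, 2) and expanding the sum with the grid's transition model, which is not reproduced here.

```latex
U(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, U(s')
```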

Sample Question

• Suppose π_i is the policy shown in the figure on the right:

– Then we have π_i(1, 1) = Up, π_i(1, 2) = Up

– Write the simplified Bellman equation for U_i(1, 2)
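Reading the policy as π_i and its utility as U_i, the simplified Bellman equation for a fixed policy drops the max over actions because the policy already picks the action. The second line specializes it to state (1, 2) with action Up; the transition probabilities depend on the grid in the figure and are left symbolic.

```latex
U_i(s) = R(s) + \gamma \sum_{s'} P(s' \mid s, \pi_i(s))\, U_i(s')

U_i(1,2) = R(1,2) + \gamma \sum_{s'} P(s' \mid (1,2), \mathit{Up})\, U_i(s')
```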

Sample Question: True/False

• In an MDP, the larger the discount factor, the more strongly favored are short-term rewards over long-term rewards. True or False. If false, correct the statement

• The utility U∗ of the optimal policy π∗ must satisfy a set of equations called the Markov conditions. True or False. If false, correct the statement

• MDP instances with small discount factors tend to emphasize near-term rewards.

• For reinforcement learning, we need to know the transition probabilities between states before we start
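To check intuition about the discount factor, the sketch below compares the discounted return of two reward sequences under a small and a large γ. The sequences and the two γ values are made up for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


early = [1, 0, 0, 0]   # small reward right away (hypothetical)
late = [0, 0, 0, 2]    # larger reward three steps later (hypothetical)

for gamma in (0.5, 0.9):
    print(gamma, discounted_return(early, gamma), discounted_return(late, gamma))
```

With γ = 0.5 the early sequence has the higher return; with γ = 0.9 the late sequence does.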

Sample Question

• For each of the applications/scenarios described below, indicate which technology is best suited: RL or MDP

• You are playing a game of Tic Tac Toe against a random opponent. You can see the board and choose actions, but your opponent chooses random actions.

Sample Question

• The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American.

• Prove that Col. West is a criminal

• Prove by Forward Chaining and/or Backward Chaining
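One way to rehearse the forward-chaining proof is to run it mechanically. The sketch below encodes the problem as definite clauses (the encoding follows the textbook's Chapter 9 treatment of this example) and applies rules until no new facts appear. The tuple representation and helper names are assumptions of this sketch. It uses one-way pattern matching, which suffices here because every known fact is ground; it is not a full unification algorithm.

```python
def is_var(term):
    """Convention for this sketch: lowercase names are variables."""
    return term[0].islower()


def match(pattern, fact, theta):
    """Extend substitution theta so that pattern equals the ground fact, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta


def all_matches(premises, facts, theta):
    """Yield every substitution that satisfies all premises using known facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        extended = match(premises[0], fact, theta)
        if extended is not None:
            yield from all_matches(premises[1:], facts, extended)


def substitute(atom, theta):
    return (atom[0],) + tuple(theta.get(t, t) for t in atom[1:])


def forward_chain(rules, facts):
    """Apply every rule repeatedly until a fixed point; return all derived facts."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in all_matches(premises, facts, {}):
                derived = substitute(conclusion, theta)
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new


rules = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "x"), ("Owns", "Nono", "x")], ("Sells", "West", "x", "Nono")),
    ([("Missile", "x")], ("Weapon", "x")),
    ([("Enemy", "x", "America")], ("Hostile", "x")),
]
facts = [
    ("American", "West"), ("Enemy", "Nono", "America"),
    ("Owns", "Nono", "M1"), ("Missile", "M1"),   # M1 names one of Nono's missiles
]

derived = forward_chain(rules, facts)
print(("Criminal", "West") in derived)  # True
```

The first pass derives Weapon(M1), Sells(West, M1, Nono), and Hostile(Nono); the second pass derives Criminal(West). Backward chaining would instead start from Criminal(West) and work back through the same rules.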

Sample Question

• What are the choices of weight vector [w0, w1, w2] that can classify y as y= x1 XOR x2?
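A quick experiment can help with this question. The sketch assumes a single threshold unit that outputs 1 when w0 + w1·x1 + w2·x2 > 0 (the unit in the original figure may be defined differently) and searches a coarse grid of weights for vectors that reproduce a target truth table.

```python
from itertools import product


def predict(w, x1, x2):
    """Single threshold unit with bias weight w0 (an assumed model)."""
    w0, w1, w2 = w
    return int(w0 + w1 * x1 + w2 * x2 > 0)


def solutions(target, grid):
    """All weight vectors on the grid that match target on the four binary inputs."""
    inputs = list(product((0, 1), repeat=2))
    return [
        w for w in product(grid, repeat=3)
        if all(predict(w, x1, x2) == target(x1, x2) for x1, x2 in inputs)
    ]


grid = [i / 2 for i in range(-8, 9)]          # -4.0 to 4.0 in steps of 0.5

xor_solutions = solutions(lambda a, b: a ^ b, grid)
and_solutions = solutions(lambda a, b: a & b, grid)

print(len(xor_solutions))       # 0
print(len(and_solutions) > 0)   # True
```

The empty XOR result is not an artifact of the grid. The four constraints w0 ≤ 0, w0 + w1 > 0, w0 + w2 > 0, and w0 + w1 + w2 ≤ 0 contradict each other, so no single linear threshold unit computes XOR.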

Sample Question

• Consider a Convolutional Neural Network (CNN) that has an Input layer containing a 13 x 13 image that is connected to a Convolution layer using a 4 x 4 filter and a stride of 1 (i.e., the filter is shifted horizontally and vertically by 1 pixel, and only filters that are entirely inside the input array are connected to a unit in the Convolution layer). There is no activation function associated with the units in the Convolution layer. The Convolution layer is connected to a max Pooling layer using a 2 x 2 filter and a stride of 2. (Only filters that are entirely inside the array in the Convolution layer are connected to a unit in the Pooling layer.) The Output layer contains 4 units that each use a ReLU activation function, and these units are fully connected to the units in the Pooling layer.

Q1: How many units are in the Convolution layer?

Q2: How many distinct weights must be learned for the connections to the Convolution layer?

Q3: How many units are in the Pooling layer?
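The layer sizes follow from the usual "valid" output-size formula, (input − filter) / stride + 1 per dimension. The sketch below works through the arithmetic under these assumptions: a single-channel image, a single filter (one feature map), and weights shared across all filter positions. Whether a bias term counts as a learned weight depends on the course's convention, so both counts are shown.

```python
def output_side(input_side, filter_side, stride):
    """Positions along one dimension where the filter fits entirely inside the input."""
    return (input_side - filter_side) // stride + 1


conv_side = output_side(13, 4, 1)        # 10
conv_units = conv_side ** 2              # Q1: 10 x 10 = 100 units

shared_weights = 4 * 4                   # Q2: 16 weights shared by every unit
shared_weights_with_bias = shared_weights + 1   # 17 if one bias is counted

pool_side = output_side(conv_side, 2, 2)  # 5
pool_units = pool_side ** 2               # Q3: 5 x 5 = 25 units

print(conv_units, shared_weights, shared_weights_with_bias, pool_units)
# prints: 100 16 17 25
```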

Short Questions

• Describe the Bellman equation using utility relationship between states

• Describe temporal difference learning. Why is it called “temporal difference” learning?

• What is Q-learning? What are the major differences between Q-learning and SARSA learning?

• Mention the differences between MDP and Reinforcement Learning

• Describe forward chaining with example

• Describe backward chaining with example

• What is a knowledge base? Explain with an example
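For the Q-learning versus SARSA question, it can help to see the two update rules side by side. This is a minimal sketch with hypothetical states, actions, and Q-values. Both are temporal-difference updates that move Q(s, a) toward a target built from the next state's value; they differ only in which next-state value the target uses.

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy: the target uses the best action available in s_next."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    return Q[(s, a)] + alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy: the target uses the action actually taken in s_next."""
    return Q[(s, a)] + alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


actions = ["Left", "Right"]
Q = {
    ("A", "Left"): 0.0, ("A", "Right"): 0.0,
    ("B", "Left"): 1.0, ("B", "Right"): 3.0,   # hypothetical current estimates
}

# One observed transition: in A take Right, receive reward 0, land in B,
# where the agent's (exploring) policy then happens to pick Left.
q_new = q_learning_update(Q, "A", "Right", 0.0, "B", actions, alpha=0.5, gamma=0.5)
sarsa_new = sarsa_update(Q, "A", "Right", 0.0, "B", "Left", alpha=0.5, gamma=0.5)

print(q_new, sarsa_new)  # 0.75 0.25
```

Q-learning backs up the value of the greedy action in B (3.0) regardless of what the agent does next, while SARSA backs up the value of the action it actually chose (1.0).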

Short Questions

• Can you resolve the following two sentences using a general unifier? If so, what sentence results?

• Write first-order-logic sentence:

– A sibling is another child of one’s parents

• Person(John) [read “John is a person”]

– Is it a predicate symbol, function symbol, or constant symbol?

• What is the difference between first-order logic and propositional logic?
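For the sibling sentence, one standard formalization (following the textbook's kinship example in Chapter 8) is shown below. It assumes Parent(p, x) is read as "p is a parent of x".

```latex
\forall x, y \;\; \mathit{Sibling}(x, y) \;\Leftrightarrow\;
  x \neq y \,\land\, \exists p \;\; \mathit{Parent}(p, x) \land \mathit{Parent}(p, y)
```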

Short Questions

• Describe the backpropagation technique used in neural networks

• What are the hyperparameters for training a neural network? Describe each of them
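As a study aid for the backpropagation question, the sketch below runs one forward and one backward pass through a tiny network and checks one gradient against a finite-difference estimate. The architecture (2 inputs, 2 sigmoid hidden units, 1 sigmoid output, no biases), the squared-error loss, and all weight values are assumptions chosen to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(W1, W2, x):
    """W1 holds one weight list per hidden unit; W2 holds the output weights."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)))
    return h, y


def loss(W1, W2, x, target):
    _, y = forward(W1, W2, x)
    return 0.5 * (y - target) ** 2


def backprop(W1, W2, x, target):
    """Propagate the error backward with the chain rule; return both gradients."""
    h, y = forward(W1, W2, x)
    delta_out = (y - target) * y * (1 - y)             # dL/d(output pre-activation)
    grad_W2 = [delta_out * hj for hj in h]
    delta_hidden = [delta_out * W2[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    grad_W1 = [[dj * xi for xi in x] for dj in delta_hidden]
    return grad_W1, grad_W2


def numeric_grad_W1(W1, W2, x, target, j, i, eps=1e-6):
    """Central-difference estimate of dL/dW1[j][i], used only as a check."""
    plus = [row[:] for row in W1]
    minus = [row[:] for row in W1]
    plus[j][i] += eps
    minus[j][i] -= eps
    return (loss(plus, W2, x, target) - loss(minus, W2, x, target)) / (2 * eps)


W1 = [[0.1, -0.2], [0.4, 0.3]]   # hypothetical starting weights
W2 = [0.5, -0.6]
x, target = [1.0, 0.5], 1.0

grad_W1, grad_W2 = backprop(W1, W2, x, target)
check = numeric_grad_W1(W1, W2, x, target, j=0, i=1)
print(abs(grad_W1[0][1] - check) < 1e-6)  # True
```

A gradient-descent step would then subtract the learning rate times each gradient from the matching weight. The learning rate, the number of hidden units, and the number of training passes are examples of the hyperparameters the second question asks about.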

How to prepare?

• Review class materials

• Make sure that you understand the concepts