|
|
| (6 intermediate revisions by the same user not shown) |
| Line 3: |
Line 3: |
| Adversarial neural networks use an architecture consisting of two separate neural networks - one network attempts to learn how to accomplish a task, and another network attempts to differentiate between the output of the first network and the "real" output. | | Adversarial neural networks use an architecture consisting of two separate neural networks - one network attempts to learn how to accomplish a task, and another network attempts to differentiate between the output of the first network and the "real" output. |
|
| |
|
| =TensorFlow Examples of Adversarial Neural Networks= | | =TensorFlow Adversarial Examples= |
|
| |
|
| ==Adversarial Crypto== | | ==Adversarial Crypto== |
| Line 9: |
Line 9: |
| This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture. | | This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture. |
|
| |
|
| Link to paper: "Learning to Protect Communications with Adversarial Neural Cryptography": https://arxiv.org/abs/1610.06918
| | Paper: "Learning to Protect Communications with Adversarial Neural Cryptography" |
| | |
| | Link to paper: https://arxiv.org/abs/1610.06918 |
|
| |
|
| Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto | | Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto |
| Line 27: |
Line 29: |
| ===The Model=== | | ===The Model=== |
|
| |
|
| We'll step through the code line-by-line again. Here's the link to the code: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py | | We'll step through the code line-by-line. Here's the link to the code: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py |
|
| |
|
| ====License====
| | Full model walkthrough is on the [[TensorFlow/Adversarial Crypto]] page. |
|
| |
|
| Obligatory license info:
| | The rundown is: |
| | * Create an AdversarialCrypto class that holds a training optimizer object for the Bob and Alice networks |
| | * Define a method that evaluates the networks as-is and prints the percent losses |
| | * Define a method that trains the network for a specified number of iterations, stopping early if the network reaches its target losses |
| | * Define a method that calls the training function (above), then re-trains Eve several more times from scratch |
|
| |
|
| <pre>
| | ==Adversarial Text== |
| # Copyright 2016 The TensorFlow Authors All Rights Reserved.
| |
| #
| |
| # Licensed under the Apache License, Version 2.0 (the "License");
| |
| # you may not use this file except in compliance with the License.
| |
| # You may obtain a copy of the License at
| |
| #
| |
| # http://www.apache.org/licenses/LICENSE-2.0
| |
| #
| |
| # Unless required by applicable law or agreed to in writing, software
| |
| # distributed under the License is distributed on an "AS IS" BASIS,
| |
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
| |
| # See the License for the specific language governing permissions and
| |
| # limitations under the License.
| |
| # ==============================================================================
| |
| </pre>
| |
| | |
| Some info about the network:
| |
| * There are actually 3 neural networks involved: Alice, Bob, and Eve
| |
| * Alice takes inputs in_m (message), in_k (key) and outputs the ciphertext
| |
| * Bob takes inputs in_k (key), ciphertext and attempts to output the plaintext
| |
| * Eve takes input ciphertext (no key) and also attempts to output the plaintext
| |
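The roles of the three parties can be sketched in plain Python. This is only an illustration with hand-written stand-ins (multiplying ±1 values, which is equivalent to XOR, for Alice and Bob, and random guessing for Eve); it is not the learned networks from <code>train_eval.py</code>, and the helper names are made up for this sketch.

```python
import random

TEXT_SIZE = 16  # same message/key length as the walkthrough


def random_bits(n, rng):
    """Return n pseudo-boolean values, each -1.0 or 1.0."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def alice(message, key):
    # Hand-written stand-in for Alice: multiplying +/-1 values acts like XOR.
    return [m * k for m, k in zip(message, key)]


def bob(ciphertext, key):
    # Bob holds the key, so he can undo Alice's multiplication exactly.
    return [c * k for c, k in zip(ciphertext, key)]


def eve(ciphertext, rng):
    # Eve has no key; this stand-in can only guess each bit.
    return random_bits(len(ciphertext), rng)


def bits_wrong(output, message):
    return sum(1 for o, m in zip(output, message) if o != m)


rng = random.Random(0)
message = random_bits(TEXT_SIZE, rng)
key = random_bits(TEXT_SIZE, rng)
ciphertext = alice(message, key)

print("Bob bits wrong:", bits_wrong(bob(ciphertext, key), message))
print("Eve bits wrong:", bits_wrong(eve(ciphertext, rng), message))
```

In the real model, Alice and Bob have to learn a scheme like this on their own, while Eve is trained to do better than guessing.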
| | |
| The file starts with imports/declarations to be compatible with Python 3:
| |
| | |
| <pre>
| |
| # TensorFlow Python 3 compatibility
| |
| from __future__ import absolute_import
| |
| from __future__ import division
| |
| from __future__ import print_function
| |
| import signal
| |
| import sys
| |
| from six.moves import xrange # pylint: disable=redefined-builtin
| |
| import tensorflow as tf
| |
| </pre>
| |
| | |
| ====Input Argument Flags and Parameters====
| |
| | |
| Hyperparameter flags can be set on the command line:
| |
| | |
| <pre>
| |
| flags = tf.app.flags
| |
| flags.DEFINE_float('learning_rate', 0.0008, 'Constant learning rate')
| |
| flags.DEFINE_integer('batch_size', 4096, 'Batch size')
| |
| FLAGS = flags.FLAGS
| |
| </pre>
| |
| | |
| The <code>tf.app.flags</code> module does not seem to be covered in the TensorFlow documentation, so its intended usage is not obvious. However, as an author on the TF project states [https://stackoverflow.com/questions/33932901/whats-the-purpose-of-tf-app-flags-in-tensorflow#33938519 here], it is intended to make demos more convenient, and it essentially wraps argparse.
| |
| | |
| Also see [[TensorFlow/Command Line Args]].
| |
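Since the flags module is described as a thin wrapper around argparse, the same two flags could be declared with the standard library directly. This is a rough equivalent for comparison, not how <code>train_eval.py</code> does it; the <code>build_parser</code> helper is a name invented for this sketch.

```python
import argparse


def build_parser():
    """argparse version of the two flags defined with tf.app.flags."""
    parser = argparse.ArgumentParser(description="Adversarial crypto flags")
    parser.add_argument("--learning_rate", type=float, default=0.0008,
                        help="Constant learning rate")
    parser.add_argument("--batch_size", type=int, default=4096,
                        help="Batch size")
    return parser


# Parse an explicit argument list instead of sys.argv for the demo.
FLAGS = build_parser().parse_args(["--batch_size", "512"])
print(FLAGS.learning_rate, FLAGS.batch_size)
```

Either way, the values end up as attributes on a <code>FLAGS</code> object, which is how the rest of the script reads them.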
| | |
| More parameter definitions follow:
| |
| | |
| <pre>
| |
| # Input and output configuration.
| |
| TEXT_SIZE = 16
| |
| KEY_SIZE = 16
| |
| | |
| # Training parameters.
| |
| ITERS_PER_ACTOR = 1
| |
| EVE_MULTIPLIER = 2 # Train Eve 2x for every step of Alice/Bob
| |
| # Train until either max loops or Alice/Bob "good enough":
| |
| MAX_TRAINING_LOOPS = 850000
| |
| BOB_LOSS_THRESH = 0.02 # Exit when Bob loss < 0.02 and Eve > 7.7 bits
| |
| EVE_LOSS_THRESH = 7.7
| |
| | |
| # Logging and evaluation.
| |
| PRINT_EVERY = 200 # In training, log every 200 steps.
| |
| EVE_EXTRA_ROUNDS = 2000 # At end, train eve a bit more.
| |
| RETRAIN_EVE_ITERS = 10000 # Retrain eve up to ITERS*LOOPS times.
| |
| RETRAIN_EVE_LOOPS = 25 # With an evaluation each loop
| |
| NUMBER_OF_EVE_RESETS = 5 # And do this up to 5 times with a fresh eve.
| |
| # Use EVAL_BATCHES samples each time we check accuracy.
| |
| EVAL_BATCHES = 1
| |
| </pre>
| |
| | |
| | |
| ====Batch of Random Booleans====
| |
| | |
| This is a function that returns a batch of random booleans. It is used to create the message that Alice encrypts, and to create the key that Alice uses to encrypt the message and Bob uses to decrypt it.
| |
| | |
| ====AdversarialCrypto Class====
| |
| | |
| The AdversarialCrypto class defines the set of three neural networks that make up the adversarial architecture. As part of the training and evaluation process <code>train_and_evaluate()</code>, an instance of this class is created and passed to the evaluation function <code>doeval()</code> in the main body of the code.
| |
| | |
| What does this class do?
| |
| * Creates the three networks for Alice, Bob, and Eve
| |
| * Creates connections from Alice to Bob and Alice to Eve to pass the correct info to the correct networks
| |
| * Defines the loss function for Eve and for Bob
| |
| * Defines the optimizers that the networks should use
| |
| * Manages the state of each network (i.e., allows you to reset the Eve network)
| |
| | |
| <pre>
| |
| class AdversarialCrypto(object):
| |
|   """Primary model implementation class for Adversarial Neural Crypto.
| |
|   This class contains the code for the model itself,
| |
|   and when created, plumbs the pathways from Alice to Bob and
| |
|   Eve, creates the optimizers and loss functions, etc.
| |
|
| |
|   Attributes:
| |
|     eve_loss: Eve's loss function.
| |
|     bob_loss: Bob's loss function. Different units from eve_loss.
| |
|     eve_optimizer: A tf op that runs Eve's optimizer.
| |
|     bob_optimizer: A tf op that runs Bob's optimizer.
| |
|     bob_reconstruction_loss: Bob's message reconstruction loss,
| |
|       which is comparable to eve_loss.
| |
|     reset_eve_vars: Execute this op to completely reset Eve.
| |
|   """
| |
| </pre>
| |
| | |
| What does the constructor do?
| |
| * The constructor creates the Alice, Bob, and Eve model by calling the model() method with the right parameters
| |
| * Creates the optimizer for Bob and for Eve
| |
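| * Sets up the loss function for Eve (the number of wrong bits, summed with <code>tf.reduce_sum()</code>) and minimizes it with <code>optimizer.minimize()</code>
| |
| * Sets up the loss function for Bob, which combines Bob's reconstruction error (also a <code>tf.reduce_sum()</code> of wrong bits) with a penalty for Eve doing better than random guessing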
| |
| | |
| <pre>
| |
|   def __init__(self):
| |
|     in_m, in_k = self.get_message_and_key()
| |
|     encrypted = self.model('alice', in_m, in_k)
| |
|     decrypted = self.model('bob', encrypted, in_k)
| |
|     eve_out = self.model('eve', encrypted, None)
| |
| | |
|     self.reset_eve_vars = tf.group(
| |
|         *[w.initializer for w in tf.get_collection('eve')])
| |
| | |
|     optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)
| |
|
| |
|
|     # Eve's goal is to decrypt the entire message:
| | This trains a neural network model to detect the sentiment in IMDB text. This illustrates semi-supervised learning. |
|     eve_bits_wrong = tf.reduce_sum(
| |
|         tf.abs((eve_out + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
| |
|     self.eve_loss = tf.reduce_sum(eve_bits_wrong)
| |
|     self.eve_optimizer = optimizer.minimize(
| |
|         self.eve_loss, var_list=tf.get_collection('eve'))
| |
|
| |
|
|     # Alice and Bob want to be accurate...
| | Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_text |
|     self.bob_bits_wrong = tf.reduce_sum(
| |
|         tf.abs((decrypted + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
| |
|     # ... and to not let Eve do better than guessing.
| |
|     self.bob_reconstruction_loss = tf.reduce_sum(self.bob_bits_wrong)
| |
|     bob_eve_error_deviation = tf.abs(float(TEXT_SIZE) / 2.0 - eve_bits_wrong)
| |
|     # 7-9 bits wrong is OK too, so we squish the error function a bit.
| |
|     # Without doing this, we often tend to hang out at 0.25 / 7.5 error,
| |
|     # and it seems bad to have continued, high communication error.
| |
|     bob_eve_loss = tf.reduce_sum(
| |
|         tf.square(bob_eve_error_deviation) / (TEXT_SIZE / 2)**2)
| |
|
| |
|
|     # Rescale the losses to [0, 1] per example and combine.
| | ==Running== |
|     self.bob_loss = (self.bob_reconstruction_loss / TEXT_SIZE + bob_eve_loss)
| |
|
| |
|
|     self.bob_optimizer = optimizer.minimize(
| | Running this model is slightly more complicated than running the adversarial crypto network. |
|         self.bob_loss,
| |
|         var_list=(tf.get_collection('alice') + tf.get_collection('bob')))
| |
| </pre>
| |
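The loss arithmetic in the constructor can be checked by hand on a tiny batch. The following is a plain-Python sketch that mirrors the formulas above on lists instead of tensors; the <code>bits_wrong</code> and <code>losses</code> helpers are names chosen for this sketch and do not appear in <code>train_eval.py</code>.

```python
TEXT_SIZE = 16


def bits_wrong(output, message):
    """Per-example count of wrong bits for values in [-1, 1]."""
    # Map -1/1 to 0/1, then sum the absolute differences.
    return sum(abs((o + 1.0) / 2.0 - (m + 1.0) / 2.0)
               for o, m in zip(output, message))


def losses(messages, bob_outputs, eve_outputs):
    """Return (eve_loss, bob_reconstruction_loss, bob_loss) for a batch."""
    eve_wrong = [bits_wrong(e, m) for e, m in zip(eve_outputs, messages)]
    bob_wrong = [bits_wrong(b, m) for b, m in zip(bob_outputs, messages)]
    eve_loss = sum(eve_wrong)
    bob_reconstruction_loss = sum(bob_wrong)
    # Penalize Eve for deviating from TEXT_SIZE / 2 wrong bits (pure guessing).
    bob_eve_loss = sum((TEXT_SIZE / 2.0 - w) ** 2 / (TEXT_SIZE / 2.0) ** 2
                       for w in eve_wrong)
    bob_loss = bob_reconstruction_loss / TEXT_SIZE + bob_eve_loss
    return eve_loss, bob_reconstruction_loss, bob_loss


message = [1.0] * TEXT_SIZE
perfect = list(message)
half_wrong = [-1.0] * 8 + [1.0] * 8

# Bob decodes perfectly and Eve gets half the bits wrong: Bob's ideal case.
ideal = losses([message], [perfect], [half_wrong])
# Eve also decodes perfectly: Bob is penalized even though he is accurate.
leaky = losses([message], [perfect], [perfect])
print(ideal, leaky)
```

The second case shows why Bob's loss includes Eve's error: an accurate Bob still has a loss of 1.0 per example if Eve can read the whole message.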
|
| |
|
| ====AdversarialCrypto Class - Creation of Neural Network Model====
| | The adversarial text network steps are as follows: |
| | * fetch data |
| | * generate vocab |
| | * generate training/validation/test data |
| | * pretrain language model |
| | * train classifier |
| | * evaluate classifier on test data |
|
| |
|
| Now, the actual creation of the models for Alice, Bob, and Eve happens in the call to <code>model()</code>. What happens with the method header?
| | ===Get Vocabulary Data=== |
| * We pass in the name of the graph component ('alice', 'bob', or 'eve') to add new model components to
| |
| * We pass in the input message (either the plain text, to Alice, or the ciphertext, to Bob and Eve)
| |
| * We pass in the key (optional); if no key is passed in, the input to the neural network is just the message
| |
|
| |
|
| Here's the model method definition:
| | Start by obtaining the data, which is an 80 MB tar file, and decompress it: |
|
| |
|
| <pre> | | <pre> |
|   def model(self, collection, message, key=None):
| | $ wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O /tmp/imdb.tar.gz |
|     """The model for Alice, Bob, and Eve. If key=None, the first FC layer
| |
|     takes only the message as inputs. Otherwise, it uses both the key
| |
|     and the message.
| |
|     Args:
| |
|       collection: The graph keys collection to add new vars to.
| |
|       message: The input message to process.
| |
|       key: The input key (if any) to use.
| |
|     """
| |
|
| |
|
|     if key is not None:
| | $ tar -xf /tmp/imdb.tar.gz -C /tmp |
|       combined_message = tf.concat(axis=1, values=[message, key])
| |
|     else:
| |
|       combined_message = message
| |
|
| |
|
| | $ du -hs /tmp/aclImdb |
| | 487M /tmp/aclImdb |
| </pre> | | </pre> |
|
| |
|
| If we pass in both a message and a key, we concatenate the inputs using <code>tf.concat()</code>. Otherwise, the only input is the message.
| | ===Build the Vocabulary=== |
|
| |
|
| The next step is to call <code>tf.contrib.framework.arg_scope()</code>. According to the [https://www.tensorflow.org/api_docs/python/tf/contrib/framework/arg_scope documentation], this context manager stores default arguments for the listed operations, each of which must be decorated with <code>@add_arg_scope</code>.
| | Use a Bazel job to build the vocabulary from the data: |
| | |
| In other words, every fully_connected layer and conv2d layer created inside the block receives <code>variables_collections=[collection]</code>, so its variables are added to the specified collection ('alice', 'bob', or 'eve'):
| |
|
| |
|
| <pre> | | <pre> |
|     # Ensure that all variables created are in the specified collection.
| | $ IMDB_DATA_DIR=/tmp/imdb |
|     with tf.contrib.framework.arg_scope(
| |
|         [tf.contrib.layers.fully_connected, tf.contrib.layers.conv2d],
| |
|         variables_collections=[collection]):
| |
| </pre>
| |
|
| |
|
| Next, we create a fully connected neural network layer. We pass in the message (concatenated with the key, if one was given), set the number of outputs to <code>TEXT_SIZE + KEY_SIZE</code> regardless of whether a key was passed, initialize the biases of the layer to zero, and do not set an activation function:
| | $ bazel run data:gen_vocab -- \ |
| | | --output_dir=$IMDB_DATA_DIR \ |
| <pre>
| | --dataset=imdb \ |
|       fc = tf.contrib.layers.fully_connected( | | --imdb_input_dir=/tmp/aclImdb \ |
|           combined_message,
| | --lowercase=False |
|           TEXT_SIZE + KEY_SIZE,
| |
|           biases_initializer=tf.constant_initializer(0.0),
| |
|           activation_fn=None)
| |
| </pre> | | </pre> |
|
| |
|
| Next, we assemble the layers of the neural network model.
| | This uses a build rule called <code>gen_vocab</code> located in <code>data/BUILD</code>: |
| | |
| The model architecture we use is:
| |
|
| |
|
| <pre> | | <pre> |
| (Fully Connected) -> (Conv2D) -> (Conv2D) -> (Conv2D) -> (Squeeze) | | py_binary( |
| | name = "gen_vocab", |
| | srcs = ["gen_vocab.py"], |
| | deps = [ |
| | ":data_utils", |
| | ":document_generators", |
| | # tensorflow dep, |
| | ], |
| | ) |
| </pre> | | </pre> |
|
| |
|
| This performs a sequence of 1D convolutions (the message is expanded out to 2D, convolved, and then squeezed back down). | | This build vocabulary step is, unfortunately, failing. See this GitHub issue (1917): https://github.com/tensorflow/models/issues/1917 |
| | |
| <pre>
| |
|       fc = tf.contrib.layers.fully_connected(
| |
|           combined_message,
| |
|           TEXT_SIZE + KEY_SIZE,
| |
|           biases_initializer=tf.constant_initializer(0.0),
| |
|           activation_fn=None)
| |
|
| |
|
|       # Perform a sequence of 1D convolutions (by expanding the message out to 2D
| | ==Adversarial Image Network== |
|       # and then squeezing it back down).
| |
|       fc = tf.expand_dims(fc, 2)
| |
|       # 2,1 -> 1,2
| |
|       conv = tf.contrib.layers.conv2d(
| |
|           fc, 2, 2, 2, 'SAME', activation_fn=tf.nn.sigmoid)
| |
|       # 1,2 -> 1, 2
| |
|       conv = tf.contrib.layers.conv2d(
| |
|           conv, 2, 1, 1, 'SAME', activation_fn=tf.nn.sigmoid)
| |
|       # 1,2 -> 1, 1
| |
|       conv = tf.contrib.layers.conv2d(
| |
|           conv, 1, 1, 1, 'SAME', activation_fn=tf.nn.tanh)
| |
|       conv = tf.squeeze(conv, 2)
| |
|       return conv
| |
| </pre>
| |
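To see how the tensor shapes change through the layers above, here is a small shape calculator in plain Python. It assumes that the positional arguments of <code>tf.contrib.layers.conv2d</code> are (inputs, num_outputs, kernel_size, stride, padding), that the rank-3 input is treated as a 1D convolution as the code comment suggests, and that 'SAME' padding gives an output length of ceil(length / stride). The helper names are invented for this sketch.

```python
import math

TEXT_SIZE = 16
KEY_SIZE = 16


def conv_same(shape, num_outputs, stride):
    """Output shape of a 'SAME'-padded 1D conv on [batch, length, channels]."""
    batch, length, _ = shape
    return (batch, math.ceil(length / stride), num_outputs)


def model_shapes(batch_size, has_key):
    """Trace tensor shapes through the layers of model()."""
    in_width = TEXT_SIZE + (KEY_SIZE if has_key else 0)
    shapes = [("input", (batch_size, in_width))]
    # The fully connected layer always has TEXT_SIZE + KEY_SIZE outputs.
    shapes.append(("fully_connected", (batch_size, TEXT_SIZE + KEY_SIZE)))
    shape = (batch_size, TEXT_SIZE + KEY_SIZE, 1)
    shapes.append(("expand_dims", shape))
    # (num_outputs, stride) for the three conv2d calls; kernel sizes are 2, 1, 1.
    for i, (num_outputs, stride) in enumerate([(2, 2), (2, 1), (1, 1)], 1):
        shape = conv_same(shape, num_outputs, stride)
        shapes.append(("conv%d" % i, shape))
    shapes.append(("squeeze", shape[:2]))
    return shapes


for name, shape in model_shapes(4096, has_key=True):
    print(name, shape)
```

Under these assumptions the first convolution halves the length from 32 to 16, so the output has one value per message bit for Alice, Bob, and Eve alike.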
| | |
| ====AdversarialCrypto Class - Creation of Message and Key====
| |
| | |
| In the constructor, the input message and key are generated using a <code>get_message_and_key()</code> method, which in turn calls the <code>batch_of_random_bools()</code> function. This is not complicated: it just constructs a batch of random pseudo-boolean values.
| |
| | |
| Here is the method in the AdversarialCrypto class:
| |
| | |
| <pre>
| |
|   def get_message_and_key(self):
| |
|     """Generate random pseudo-boolean key and message values."""
| |
| | |
|     batch_size = tf.placeholder_with_default(FLAGS.batch_size, shape=[])
| |
| | |
|     in_m = batch_of_random_bools(batch_size, TEXT_SIZE)
| |
|     in_k = batch_of_random_bools(batch_size, KEY_SIZE)
| |
|     return in_m, in_k
| |
| </pre>
| |
| | |
| and the batch_of_random_bools method that it calls:
| |
| | |
| <pre>
| |
| def batch_of_random_bools(batch_size, n):
| |
|   """Return a batch of random "boolean" numbers.
| |
|   Args:
| |
|     batch_size: Batch size dimension of returned tensor.
| |
|     n: number of entries per batch.
| |
|   Returns:
| |
|     A [batch_size, n] tensor of "boolean" numbers, where each number is
| |
|     represented as -1 or 1.
| |
|   """
| |
|
| |
|   as_int = tf.random_uniform(
| |
|       [batch_size, n], minval=0, maxval=2, dtype=tf.int32)
| |
|   expanded_range = (as_int * 2) - 1
| |
|   return tf.cast(expanded_range, tf.float32)
| |
| </pre>
| |
| | |
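| This creates a tensor of values drawn uniformly from 1 and -1. Here's a quick interactive IPython session to illustrate. Note that each call to <code>eval()</code> draws fresh random values, so <code>Out[6]</code> is not the element-wise transform of <code>Out[4]</code>: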
| |
| | |
| <pre>
| |
| In [1]: import tensorflow as tf
| |
| | |
| In [2]: tf.InteractiveSession()
| |
| 2017-10-26 00:24:11.694267: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
| |
| 2017-10-26 00:24:11.694303: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
| |
| 2017-10-26 00:24:11.694313: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
| |
| 2017-10-26 00:24:11.694321: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
| |
| Out[2]: <tensorflow.python.client.session.InteractiveSession at 0x11462bba8>
| |
| | |
| In [3]: as_int = tf.random_uniform([10,2], minval=0, maxval=2, dtype=tf.int32)
| |
| | |
| In [4]: as_int.eval()
| |
| Out[4]:
| |
| array([[0, 0],
| |
| [1, 1],
| |
| [0, 0],
| |
| [0, 1],
| |
| [1, 0],
| |
| [0, 0],
| |
| [0, 1],
| |
| [1, 1],
| |
| [0, 0],
| |
| [1, 0]], dtype=int32)
| |
| | |
| In [5]: expanded_range = (as_int*2)-1
| |
| | |
| In [6]: expanded_range.eval()
| |
| Out[6]:
| |
| array([[-1, 1],
| |
| [ 1, 1],
| |
| [-1, 1],
| |
| [ 1, -1],
| |
| [ 1, 1],
| |
| [-1, 1],
| |
| [ 1, -1],
| |
| [-1, -1],
| |
| [-1, -1],
| |
| [ 1, -1]], dtype=int32)
| |
| </pre>
| |
| | |
| | |
| ====Do Evaluation Method====
| |
| | |
| We now come to the definition of the function <code>doeval()</code>. This function evaluates how well the cryptosystem learned by the Alice/Bob/Eve networks minimizes Eve's ability to decrypt the message from the ciphertext alone, while maximizing Bob's ability to decrypt the message with the ciphertext and key.
| |
| | |
| This method evaluates only, and does not train. It runs the networks on n batches of messages, computes the average number of bits per message that Bob and Eve get wrong, prints both values, and returns them. The code names these values <code>bob_loss_percent</code> and <code>eve_loss_percent</code>, but they are measured in bits rather than percent. These values are used to determine when to stop training the networks.
| |
| | |
| The method header takes a few arguments:
| |
| * The TensorFlow session
| |
| * The AdversarialCrypto class instance
| |
| * The number of iterations that should be run
| |
| * The iteration count to write to the log
| |
| | |
| <pre>
| |
| def doeval(s, ac, n, itercount):
| |
|   """
| |
|   Evaluate the current network on n batches of random examples.
| |
|   Args:
| |
|     s: The current TensorFlow session
| |
|     ac: an instance of the AdversarialCrypto class
| |
|     n: The number of iterations to run.
| |
|     itercount: Iteration count label for logging.
| |
|   Returns:
| |
|     Bob and Eve's loss, as a percent of bits incorrect.
| |
|   """
| |
| </pre>
| |
| | |
| The main role of the doeval function is to run the neural networks and compute the losses that result. All three networks live on the single graph attached to the TensorFlow session <code>s</code>, so we can fetch tensors from any of them with one call to <code>s.run()</code>.
| |
| | |
| Link to Session class documentation: https://www.tensorflow.org/api_docs/python/tf/Session
| |
| | |
| Link to Session.run documentation: https://www.tensorflow.org/api_docs/python/tf/Session#run
| |
| | |
| The run method is passed a list of loss functions owned by the AdversarialCrypto class - these were the sums of incorrect bits defined above (<code>self.bob_reconstruction_loss = tf.reduce_sum(self.bob_bits_wrong)</code>).
| |
| | |
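| The reduce_sum function sums tensor components along the given axes, thus reducing the dimensionality of the tensor. In this case the incorrect bits are first summed along axis 1 (the bits of each message), and a second <code>tf.reduce_sum()</code> call without an axis then sums those per-message counts over the whole batch.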
| |
| | |
| Link to reduce_sum documentation: https://www.tensorflow.org/api_docs/python/tf/reduce_sum
| |
| | |
| When we call <code>s.run()</code>, we can pass a single graph element, or an arbitrarily nested collection containing graph elements at the leaves. The graph element can be an operation, a tensor, a tensor handle, or a string naming a tensor or operation in the graph; in this case, we pass a list of two tf.reduce_sum operations.
| |
| | |
| Because the <code>s.run()</code> method is passed two values, it will return two values - the corresponding values returned from the two operations. Thus, the variable bl contains the result of the tf.reduce_sum call for Bob's network (the number of incorrect bits from Bob's network), while el contains the result of the tf.reduce_sum call for Eve's network.
| |
| | |
| <pre>
| |
|   bob_loss_accum = 0
| |
|   eve_loss_accum = 0
| |
|   for _ in xrange(n):
| |
|     bl, el = s.run([ac.bob_reconstruction_loss, ac.eve_loss])
| |
|     bob_loss_accum += bl
| |
|     eve_loss_accum += el
| |
| </pre>
| |
| | |
| The accumulators are set to zero once, before the loop, and the losses are accumulated across the n batches.
| |
| | |
| After all n batches have run, the average losses are computed, printed, and returned.
| |
| | |
| <pre>
| |
|   bob_loss_percent = bob_loss_accum / (n * FLAGS.batch_size)
| |
|   eve_loss_percent = eve_loss_accum / (n * FLAGS.batch_size)
| |
|   print('%d %.2f %.2f' % (itercount, bob_loss_percent, eve_loss_percent))
| |
|   sys.stdout.flush()
| |
|   return bob_loss_percent, eve_loss_percent
| |
| </pre>
| |
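The accumulate-then-average logic of <code>doeval()</code> can be tried without TensorFlow. In this sketch <code>run_batch</code> is a hypothetical stand-in for <code>s.run([ac.bob_reconstruction_loss, ac.eve_loss])</code>, and the numbers returned by <code>fake_run_batch</code> are made up for illustration.

```python
def average_bits_wrong(run_batch, n, batch_size):
    """Average per-message bits wrong for Bob and Eve over n batches.

    run_batch is a hypothetical stand-in for the s.run() call; it returns
    the two losses, each already summed over one batch.
    """
    bob_accum = 0.0
    eve_accum = 0.0
    for _ in range(n):
        bob_batch, eve_batch = run_batch()
        bob_accum += bob_batch
        eve_accum += eve_batch
    examples = n * batch_size
    return bob_accum / examples, eve_accum / examples


def fake_run_batch():
    # Pretend each batch of 4 messages had 2 wrong bits for Bob, 32 for Eve.
    return 2.0, 32.0


bob_avg, eve_avg = average_bits_wrong(fake_run_batch, n=3, batch_size=4)
print('%.2f %.2f' % (bob_avg, eve_avg))
```

Dividing by <code>n * batch_size</code> is what turns the summed losses into per-message averages, which is why they can be compared against thresholds such as 0.02 and 7.7 bits.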
| | |
| ====Training Methods====
| |
| | |
| There are two training methods:
| |
| * train_until_thresh() - trains the network for up to a specified number of training loops. This method quits early and returns True if the loss function targets are achieved. Otherwise it returns False.
| |
| * train_and_evaluate() - trains the network, then trains Eve for extra rounds. Also resets Eve a specified number of times, giving Eve the chance to re-train and find alternative ways to break Alice and Bob's cryptosystem.
| |
| | |
| The <code>train_until_thresh()</code> method takes a session and an AdversarialCrypto object. It runs for a specified maximum number of training loops. Within each training loop, it runs Bob's optimizer (which updates both Alice and Bob), then runs Eve's optimizer <code>EVE_MULTIPLIER</code> times as often:
| |
| | |
| <pre>
| |
| def train_until_thresh(s, ac):
| |
|   for j in xrange(MAX_TRAINING_LOOPS):
| |
|     for _ in xrange(ITERS_PER_ACTOR):
| |
|       s.run(ac.bob_optimizer)
| |
|     for _ in xrange(ITERS_PER_ACTOR * EVE_MULTIPLIER):
| |
|       s.run(ac.eve_optimizer)
| |
| </pre>
| |
| | |
| It is important to note here that bob_optimizer and eve_optimizer are both calling minimize() on the same AdamOptimizer object, but passing different loss functions to it. From above, the optimizer was defined as:
| |
| | |
| <pre>
| |
| optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)
| |
| </pre>
| |
| | |
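| and bob_optimizer calls optimizer.minimize() on <code>bob_loss</code> (Bob's reconstruction error plus the penalty for Eve doing better than guessing), updating only Alice's and Bob's variables, while eve_optimizer calls optimizer.minimize() on the sum of incorrect bits from Eve's network, updating only Eve's variables.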
| |
| | |
| Next, every <code>PRINT_EVERY</code> steps, the evaluation method is called. This prints progress, but also checks if the target losses have been achieved. If they have, the method returns True, ending the training early.
| |
| | |
| <pre>
| |
|     if j % PRINT_EVERY == 0:
| |
|       bob_avg_loss, eve_avg_loss = doeval(s, ac, EVAL_BATCHES, j)
| |
|       if (bob_avg_loss < BOB_LOSS_THRESH and eve_avg_loss > EVE_LOSS_THRESH):
| |
|         print('Target losses achieved.')
| |
|         return True
| |
|   return False
| |
| </pre>
| |
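The control flow of the training loop (alternating updates, periodic evaluation, early exit) can be exercised with stub functions in place of the TensorFlow ops. This is a sketch only: <code>bob_step</code>, <code>eve_step</code>, and <code>evaluate</code> are hypothetical stand-ins for <code>s.run(ac.bob_optimizer)</code>, <code>s.run(ac.eve_optimizer)</code>, and <code>doeval()</code>, and the loss values are invented.

```python
ITERS_PER_ACTOR = 1
EVE_MULTIPLIER = 2
PRINT_EVERY = 200
BOB_LOSS_THRESH = 0.02
EVE_LOSS_THRESH = 7.7


def train_until_thresh(bob_step, eve_step, evaluate, max_loops):
    """Alternate Bob and Eve updates; stop early once both targets are met."""
    for j in range(max_loops):
        for _ in range(ITERS_PER_ACTOR):
            bob_step()
        for _ in range(ITERS_PER_ACTOR * EVE_MULTIPLIER):
            eve_step()
        if j % PRINT_EVERY == 0:
            bob_avg, eve_avg = evaluate(j)
            if bob_avg < BOB_LOSS_THRESH and eve_avg > EVE_LOSS_THRESH:
                return True
    return False


calls = {"bob": 0, "eve": 0}


def bob_step():
    calls["bob"] += 1


def eve_step():
    calls["eve"] += 1


def evaluate(j):
    # Hypothetical losses: the targets are met from loop 400 onwards.
    return (0.01, 7.9) if j >= 400 else (0.5, 6.0)


reached = train_until_thresh(bob_step, eve_step, evaluate, max_loops=1000)
print(reached, calls)
```

Because the thresholds are only checked every <code>PRINT_EVERY</code> loops, training can continue for up to 199 loops after the targets are first met.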
| | |
| ==Adversarial Text== | |
|
| |
|
| =Flags= | | =Flags= |