From charlesreid1

 

Latest revision as of 00:13, 27 October 2017

Adversarial Neural Networks

Adversarial neural networks use an architecture consisting of two separate neural networks: one network attempts to learn how to accomplish a task, and the other attempts to distinguish the first network's output from the "real" output.

TensorFlow Adversarial Examples

Adversarial Crypto

This adversarial crypto neural network attempts to learn how to protect communications using the adversarial architecture.

Paper: "Learning to Protect Communications with Adversarial Neural Cryptography"

Link to paper: https://arxiv.org/abs/1610.06918

Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_crypto

Part of the tensorflow models repository (https://github.com/tensorflow/models/tree/master/research).

Running

To train the network:

$ python train_eval.py

Training works in two phases. First, the "defender" network (representing the Alice-Bob channel) is trained until it is sufficiently well-trained. Then the "attacker" network (representing the eavesdropper Eve) is reset and retrained from scratch several times, giving the eavesdropper multiple opportunities to find weaknesses in the cryptosystem.
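The train-then-reset schedule described above can be sketched as plain control flow. This is only a rough outline, not the code from train_eval.py: the callables step_fn, reset_fn, and eve_step_fn are hypothetical stand-ins for running the TensorFlow ops, and the thresholds are passed in rather than read from the script.

```python
def train_defender(step_fn, max_loops, bob_thresh, eve_thresh):
    """Train until Bob is accurate and Eve is near chance, or loops run out.

    step_fn is a hypothetical callable that runs one training step and
    returns (bob_loss, eve_loss).
    Returns (loops_used, reached_targets).
    """
    for loop in range(1, max_loops + 1):
        bob_loss, eve_loss = step_fn()
        if bob_loss < bob_thresh and eve_loss > eve_thresh:
            return loop, True
    return max_loops, False


def retrain_attacker(reset_fn, eve_step_fn, resets, iters_per_reset):
    """Reset Eve and retrain her from scratch `resets` times.

    reset_fn and eve_step_fn are hypothetical callables; eve_step_fn runs
    one Eve-only training step and returns her current loss.
    Returns the lowest loss Eve reached across all attempts.
    """
    best = float("inf")
    for _ in range(resets):
        reset_fn()  # a fresh Eve: nothing learned so far is kept
        for _ in range(iters_per_reset):
            best = min(best, eve_step_fn())
    return best
```

The point of the second function is that the defender is judged against the best of several fresh attackers, not just the one it happened to be trained against.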

The Model

The model is defined in train_eval.py: https://github.com/tensorflow/models/blob/master/research/adversarial_crypto/train_eval.py

The full line-by-line walkthrough is on the TensorFlow/Adversarial Crypto page.

The rundown is:

  • Create an AdversarialCrypto class that builds the Alice, Bob, and Eve networks and holds the loss functions and optimizer ops for the Alice/Bob pair and for Eve
  • Define a method that evaluates the networks as-is and prints the percent losses
  • Define a method that trains the network for a specified number of iterations, stopping early if the network reaches its target losses
  • Define a method that calls the training function (above), then re-trains Eve several more times from scratch
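For a sense of what the training in that rundown is optimizing, here is a plain-Python sketch of the loss arithmetic as we read it in train_eval.py. The original uses TensorFlow ops on batched tensors; the list-of-lists version below is our own simplification, and the function names are made up for illustration.

```python
TEXT_SIZE = 16  # message length in bits (the value used in train_eval.py)


def bits_wrong(output, message):
    """Per-example count of wrong bits.

    Values live in [-1, 1], so each is mapped to [0, 1] before comparing.
    """
    return [
        sum(abs((o + 1.0) / 2.0 - (m + 1.0) / 2.0)
            for o, m in zip(out_row, msg_row))
        for out_row, msg_row in zip(output, message)
    ]


def eve_loss(eve_out, message):
    """Eve wants the whole message: total bits wrong over the batch."""
    return sum(bits_wrong(eve_out, message))


def bob_loss(bob_out, eve_out, message):
    """Alice/Bob want Bob accurate and Eve no better than guessing."""
    reconstruction = sum(bits_wrong(bob_out, message))
    # Random guessing gets TEXT_SIZE / 2 bits wrong; penalize Eve's
    # squared deviation from that, rescaled to [0, 1] per example.
    eve_term = sum(
        (TEXT_SIZE / 2.0 - wrong) ** 2 / (TEXT_SIZE / 2.0) ** 2
        for wrong in bits_wrong(eve_out, message)
    )
    return reconstruction / TEXT_SIZE + eve_term
```

With Bob perfect and Eve getting exactly 8 of 16 bits wrong on every example, bob_loss is 0. If Eve also decrypts perfectly, the Eve term contributes 1 per example, so Alice and Bob are penalized for leaking the message even when Bob's reconstruction is exact.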

Adversarial Text

This trains a neural network model to classify the sentiment of IMDB movie reviews, and illustrates semi-supervised learning.

Link to code: https://github.com/tensorflow/models/tree/master/research/adversarial_text

Running

Running this model is slightly more complicated than running the adversarial crypto network.

The adversarial text network steps are as follows:

  • fetch data
  • generate vocab
  • generate training/validation/test data
  • pretrain language model
  • train classifier
  • evaluate classifier on test data

Get Vocabulary Data

Start by obtaining the data, which is an 80 MB tar file, and decompress it:

$ wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O /tmp/imdb.tar.gz

$ tar -xf /tmp/imdb.tar.gz -C /tmp

$ du -hs /tmp/aclImdb
487M	/tmp/aclImdb
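If you'd rather do the fetch-and-unpack step from Python, something like the following should be roughly equivalent to the wget/tar/du commands above. It only uses the standard library; the helper names (fetch, extract, dir_size_bytes) are our own and not part of the adversarial_text code.

```python
import tarfile
import urllib.request
from pathlib import Path

IMDB_URL = "http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"


def fetch(url, archive_path):
    """Download url to archive_path unless the file is already there."""
    archive_path = Path(archive_path)
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract(archive_path, dest_dir):
    """Unpack a tar archive into dest_dir; return top-level entry names."""
    with tarfile.open(str(archive_path)) as tar:
        tar.extractall(str(dest_dir))
        names = {
            Path(member.name).parts[0]
            for member in tar.getmembers()
            if Path(member.name).parts
        }
    return sorted(names)


def dir_size_bytes(path):
    """Total size of all regular files under path (roughly `du -s`)."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())
```

Calling fetch(IMDB_URL, "/tmp/imdb.tar.gz") followed by extract("/tmp/imdb.tar.gz", "/tmp") mirrors the shell commands; dir_size_bytes("/tmp/aclImdb") then gives a byte count to compare against the du output.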

Build the Vocabulary

Use a Bazel job to build the vocabulary from the data:

$ IMDB_DATA_DIR=/tmp/imdb

$ bazel run data:gen_vocab -- \
    --output_dir=$IMDB_DATA_DIR \
    --dataset=imdb \
    --imdb_input_dir=/tmp/aclImdb \
    --lowercase=False
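When scripting the pipeline, it can be handy to assemble that Bazel invocation in Python instead of retyping it. The sketch below just builds the argument list from the flags shown above; gen_vocab_command and as_shell are hypothetical helper names of ours.

```python
import shlex


def gen_vocab_command(output_dir, imdb_input_dir, lowercase=False):
    """Return the argv list for the gen_vocab Bazel target shown above."""
    return [
        "bazel", "run", "data:gen_vocab", "--",
        "--output_dir={}".format(output_dir),
        "--dataset=imdb",
        "--imdb_input_dir={}".format(imdb_input_dir),
        "--lowercase={}".format(lowercase),
    ]


def as_shell(argv):
    """Render an argv list as a single shell-quoted command line."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The list can be handed to subprocess.run(cmd, check=True) from the adversarial_text directory, and as_shell(cmd) is useful for logging exactly what was run.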

This uses a build rule called gen_vocab located in data/BUILD:

py_binary(
    name = "gen_vocab",
    srcs = ["gen_vocab.py"],
    deps = [
        ":data_utils",
        ":document_generators",
        # tensorflow dep,
    ],
)

Unfortunately, this vocabulary build step was failing as of this revision. See GitHub issue 1917: https://github.com/tensorflow/models/issues/1917

Adversarial Image Network

Flags