Tasks for Privacy-Preserving Machine Learning

Confidential Computing

Privacy-Preserving Machine Learning

Goal: study the methods proposed in papers on privacy-preserving machine learning (key focus).

Paper	Comment	Source	Todo
shadownet_a_secure_and_efficient_on_device_model_inference_system_for_convolutional_neural_networks.pdf		IEEE S&P 2023
goten_gpu_outsourcing_trusted_execution_of_neural_network_training.pdf		AAAI 2021
oblivious_multi_party_machine_learning_on_trusted_processors.pdf		USENIX Security 2016
slalom_fast_verifiable_and_private_execution_of_neural_networks_in_trusted_hardware.pdf	Slalom, proposed by Tramèr and Boneh, is the first solution that leverages both a GPU (for efficient batch computation) and a trusted execution environment (TEE) (for minimizing the use of cryptography).	ICLR 2019	✅
delphi_a_cryptographic_inference_service_for_neural_networks.pdf	Delphi builds on Gazelle and combines homomorphic encryption, garbled circuits, and secret sharing to protect the privacy of both client and server during neural network inference.	USENIX Security 2020	✅
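
The Delphi entry above mentions secret shares. As a rough illustration of that building block only (not Delphi's actual protocol), the following Python sketch shows additive secret sharing. The modulus and the function names `share`, `reconstruct`, `add_shared`, and `scale_shared` are assumptions made for this example.

```python
import secrets

# Assumption for this sketch: all arithmetic is done modulo a fixed prime.
MODULUS = 2**61 - 1


def share(value, n_parties=2):
    """Split an integer into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; only the complete set reveals the secret."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally, with no communication."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


def scale_shared(shares, public_constant):
    """Multiplying by a public constant is also a purely local operation."""
    return [(s * public_constant) % MODULUS for s in shares]


x_shares = share(20)
y_shares = share(22)
total = reconstruct(add_shared(x_shares, y_shares))   # 20 + 22 = 42
scaled = reconstruct(scale_shared(x_shares, 3))       # 3 * 20 = 60
```

Additions and multiplications by public constants stay local to each party, which is why linear layers are comparatively cheap on shared data. Multiplying two shared values, or evaluating a non-linear activation, needs extra machinery that this sketch does not cover.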

📌 Challenge tasks:

🌟 Survey papers on privacy-preserving machine learning and share your findings in a slide deck.

Trusted Execution Environment

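Goal: understand the concept of a trusted execution environment (TEE), master the TEE programming model, and understand how the designs of Intel SGX and Arm TrustZone are alike and how they differ, along with the strengths and weaknesses of each architecture.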

intel_sgx_explained.pdf

OP-TEE is a Trusted Execution Environment (TEE) designed as a companion to a non-secure Linux kernel running on Arm.

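📌 Challenge tasks:

🌟 Implement an Intel SGX demo; provide the code and a summary of what you learned along the way.

🌟 Implement an Arm TrustZone demo; provide the code and a summary of what you learned along the way.

🌟🌟 Understand the architectural similarities and differences between SGX and TrustZone, and their respective strengths and weaknesses, and present them in a slide deck.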

Machine Learning

Transformer-Based Generative Models

Goal: understand the network architecture of Transformer-based large generative models.

NIPS-2017-attention-is-all-you-need-Paper.pdf

d2l-attention-mechanisms

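📌 Challenge tasks:

🌟 Taking Llama 2 as a representative example, summarize its network architecture and computation flow in a slide deck.

🌟🌟 Run Llama 2 on a single machine, try to measure the CPU/GPU compute load of the different functions in the network, and document the measurement method and results.

🌟🌟🌟 Try to evaluate the importance of Llama 2's weight parameters, and document the evaluation method and results.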

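Distributed Training Framework

Goal: study distributed training frameworks, understand how deep learning models are trained in a distributed setting, and understand the bottlenecks of distributed training and the common ways to optimize them.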

a_survey_on_distributed_machine_learning.pdf

gpipe_easy_scaling_with_micro_batch_pipeline_parallelism.pdf

horovod_fast_and_easy_distributed_deep_learning_in_tensorflow.pdf

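📌 Challenge tasks:

🌟 Survey papers on distributed training, understand the optimization methods they propose, and present them in a slide deck.

🌟🌟 Study a distributed training framework, grasp the general design ideas and code organization of such frameworks, and present them in a slide deck.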

Tensor Computing Engine

Goal: study how tensor computing engines are implemented, and implement a basic tensor computing engine in C, C++, or Rust.

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

ATen is the tensor computing engine that underlies PyTorch.

BLAS (Basic Linear Algebra Subprograms) are routines that provide standard building blocks for performing basic vector and matrix operations.
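
To make the "building block" idea concrete, here is a minimal reference sketch of the general matrix multiply that BLAS Level 3 defines, C ← αAB + βC. It is written in plain Python over nested lists purely to show the semantics; the function name `gemm` and the list-of-lists layout are choices made for this example, and a real BLAS call works on flat arrays and is heavily optimized.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    shapes_ok = (
        all(len(row) == k for row in a)
        and all(len(row) == n for row in b)
        and len(c) == m
        and all(len(row) == n for row in c)
    )
    if not shapes_ok:
        raise ValueError("shape mismatch: need a (m x k), b (k x n), c (m x n)")

    # Start from beta * C, then accumulate alpha * A * B on top of it.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = alpha * a[i][p]
            for j in range(n):
                out[i][j] += a_ip * b[p][j]
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
plain_product = gemm(1.0, a, b, 0.0, c)   # [[19.0, 22.0], [43.0, 50.0]]
fused_update = gemm(2.0, a, b, 1.0, c)    # [[39.0, 45.0], [87.0, 101.0]]
```

The i-p-j loop order walks each row of `b` contiguously, which hints at the cache-locality concerns that optimized implementations address with blocking and vectorization.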

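📌 Challenge tasks:

🌟🌟 Study one tensor computing engine, grasp the general design ideas and code organization of such engines, and present them in a slide deck.

🌟🌟🌟 Try to implement, in C, C++, or Rust, a tensor computing engine library that supports linear operations, activation functions, and automatic differentiation.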
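
As a starting point for the automatic-differentiation part of the task above, the sketch below shows reverse-mode autodiff on scalars. Python is used only to keep it short; the task itself asks for C, C++, or Rust, and a real engine would operate on tensors rather than scalars. The class name `Value` and its methods are hypothetical names chosen for this illustration.

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def relu(self):
        out = Value(max(0.0, self.data), (self,))

        def _backward():
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Visit nodes in reverse topological order, applying the chain rule."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


# One "neuron": y = relu(w * x + b)
w, x, b = Value(2.0), Value(3.0), Value(-1.0)
y = (w * x + b).relu()
y.backward()
# y.data is 5.0; the gradients are w.grad = 3.0, x.grad = 2.0, b.grad = 1.0
```

The same structure carries over to a tensor engine: each operation stores its inputs and a closure (or function pointer) for its local gradient, and `backward` replays them in reverse topological order.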

📌 Challenge tasks:

🌟 Learn the basic concepts of federated learning and share them in a slide deck.

FederatedAI

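🌟 Learn the basic concepts of secure multi-party computation (MPC) and share them in a slide deck.

🌟 Learn the basic concepts of homomorphic encryption and share them in a slide deck.

🌟 Learn the basic concepts and methods of quantization in deep learning and share them in a slide deck.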
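
For the quantization task, one common scheme is affine (asymmetric) 8-bit quantization, which maps a float range onto integers using a scale and a zero point. The sketch below is a minimal illustration under that assumption; the function names and the signed int8 range are choices made for this example rather than the API of any particular framework.

```python
def quant_params(x_min, x_max, q_min=-128, q_max=127):
    """Derive (scale, zero_point) so that [x_min, x_max] maps onto [q_min, q_max]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0
    scale = (x_max - x_min) / (q_max - q_min)
    zero_point = round(q_min - x_min / scale)
    zero_point = max(q_min, min(q_max, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, q_min=-128, q_max=127):
    """Map floats to integers, clamping anything outside the representable range."""
    return [max(q_min, min(q_max, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate floats."""
    return [scale * (q - zero_point) for q in q_values]


weights = [-0.8, -0.1, 0.0, 0.35, 1.2]
scale, zero_point = quant_params(min(weights), max(weights))
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

For values inside the calibrated range, the round-trip error stays within about one quantization step (`scale`). That bound is the basic trade-off to examine when comparing bit widths or calibration strategies.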