Shared Flashcard Set

Details

Title

CS411 Final

Description

finals

Total Cards

Subject

Computer Science

Level

Undergraduate 3

Created

12/16/2024

Term

How should passwords be stored?

Definition

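- Don't store passwords as clear text
- Hash the password (with salt) and store the hashed value
- Don't use SHA-1 (it's weak); prefer a slow, purpose-built password hash such as PBKDF2, bcrypt, scrypt, or Argon2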
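
A minimal sketch of salted password hashing, using only Python's standard library. PBKDF2 with SHA-256 is swapped in here as one reasonable choice; the card itself does not name an algorithm, and the iteration count and function names are assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password, using a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)
```

Store the salt and the digest together; the clear-text password is never written anywhere.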

Term

Why salt passwords?

Definition

- Salt prevents identical passwords from having identical hashes, which also defeats precomputed (rainbow table) lookups

Term

What are some tradeoffs in security design

Definition

- (Services offered) versus level of security

- (Ease of use) versus security

- (Cost of security) versus cost of loss

Term

Examples of network threats

Definition

- Unauthorized access
- Impersonation
- Denial of service

Term

What is Form field code injection

Definition

HTML forms in which data are not validated are open to SQL injection

ALWAYS validate and escape form input (or, for SQL, use parameterized queries) before passing the data to the next page or script

ALWAYS assume that any data input by a user is malicious
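
One common way to keep form input from being interpreted as SQL is a parameterized query. The sketch below uses Python's built-in sqlite3 module; the `users` table and the `find_user` helper are hypothetical, made up for illustration.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The ? placeholder makes the driver treat `name` purely as data,
    # so quotes inside it cannot change the structure of the query.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


# Hypothetical in-memory table standing in for a real user database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
```

A classic injection string such as `' OR '1'='1` is then just an unusual name that matches no row.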

Term

Definition

SSL handshake (RSA) without keyless SSL
Public Key Encryption Operation
Public Key Signature Operation
Authorization based on 3rd party authentication

Term

What is SOP?

Definition

Same-Origin Policy (sometimes called Single Origin Policy)
- Scripts on your webpage may only read data from the same origin (same scheme, host, and port) that served the page, not from other servers/webpages.

Term

What is CORS?

Definition

Cross-Origin Resource Sharing
- When a webpage wants to access a resource on another origin, the browser first asks that server whether it trusts the page's origin to make the request.

Term

What is a pre-flight request?

Definition

- Part of CORS
- Used to check whether the server will permit a cross-origin request with specific HTTP methods or headers before the actual request is made.
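
A rough sketch of the decision a server might make when it receives a pre-flight (OPTIONS) request. The allowed origin, methods, and headers are hypothetical values, and `handle_preflight` is an illustrative helper, not part of any framework.

```python
from typing import Optional

ALLOWED_ORIGINS = {"https://app.example.com"}  # hypothetical trusted origin
ALLOWED_METHODS = {"GET", "POST"}
ALLOWED_HEADERS = {"content-type", "authorization"}


def handle_preflight(origin: str, method: str, headers: list) -> Optional[dict]:
    """Return CORS response headers if the real request would be allowed, else None.

    `origin`, `method`, and `headers` correspond to the Origin,
    Access-Control-Request-Method, and Access-Control-Request-Headers
    values sent by the browser.
    """
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        return None
    if any(h.lower() not in ALLOWED_HEADERS for h in headers):
        return None
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": ", ".join(sorted(ALLOWED_HEADERS)),
    }
```

If the server does not send back matching Allow headers, the browser never sends the actual request.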

Term

What are CSRF Tokens?

Definition

When a website wants to accept form input, it just gets an HTTP request, and doesn't know if it corresponds to a button press on its own page. To solve this the webpage generates a random token (CSRF Token) associated with that instance of that form, hidden in the form itself. The server rejects any submission whose token is missing or does not match.
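
The token flow can be sketched as below, assuming tokens are kept in a simple in-memory dictionary keyed by a session id. The storage and the function names are hypothetical; a real site would tie this to its session handling.

```python
import hmac
import secrets

_tokens = {}  # session id -> issued token (in-memory, for illustration only)


def issue_csrf_token(session_id: str) -> str:
    """Create a random token to embed as a hidden field in the form."""
    token = secrets.token_urlsafe(32)
    _tokens[session_id] = token
    return token


def is_valid_submission(session_id: str, submitted_token: str) -> bool:
    """Accept the form only if the submitted token matches the issued one."""
    expected = _tokens.get(session_id)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))
```

A request forged from another site cannot read the hidden field, so it cannot supply a matching token.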

Term

What is XSS?

Definition

- Cross-Site Scripting
- Don't trust user-generated input

- The response will include a Content-Security-Policy header

Before loading a resource or running a script, the browser checks whether it is in the whitelist. The easiest way to do XSS is to inject inline styles and scripts. Inline content can be disabled entirely via the CSP header

Term

What is CSP?

Definition

Content Security Policy
- Included with the response
- Before loading something, the browser checks if it's in the CSP whitelist

Term

What is SSL Hijacking in a Man-in-the-Middle (MITM) attack?

Definition

- SSL Hijacking occurs when an attacker intercepts and manipulates the communication between a client and a server during the SSL/TLS handshake process.

Term

What is MITM?

Definition

Man in the Middle Attack

Term

What is DDoS?

Definition

Distributed Denial of Service (DDoS) Attack

A DDoS attack uses a large number of “bots” on infected computers to make more requests than a server can handle, rendering it incapable of responding to legitimate requests

Term

Ways of limiting DDoS attacks?

Definition

- Rate Limiting

- Traffic Filtering

Term

What is Rate limiting in DDoS mitigation?

Definition

One form of mitigation is only responding to a certain number of requests per second from a given IP
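
A small sketch of per-IP rate limiting with a sliding time window. The class name and the limits used are assumptions; the clock is passed in explicitly so the behavior is easy to test.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `max_requests` per `window_seconds` for each IP address."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: do not respond
        hits.append(now)
        return True
```

In a server, `now` would typically come from `time.monotonic()` and a rejected request would get an HTTP 429 response.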

Term

What is Traffic filtering in DDoS mitigation?

Definition

DDoS attacks can be mitigated by maintaining a list of IP addresses known to be infected and dropping traffic from them

- Use a reverse proxy server
- CAPTCHA

Term

What is a Reverse Proxy Server? How does it help with DDoS mitigation?

Definition

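A reverse proxy sits between clients and the backend servers. It helps mitigate DDoS by caching responses and load balancing requests across servers (e.g. use Cloudflare)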

Term

What are race conditions?

Definition

When two or more threads/processes attempt to read or write to a shared resource at the same time, the outcome may vary based on the scheduling and order of execution.

Race conditions arise from non-deterministic interleaving of operations and a lack of proper synchronization mechanisms.

Race conditions can lead to wrong or inconsistent results, like data corruption, unexpected behavior, crashes, or other system failures.
- Hard to test because the behaviors are dependent on the specific timing and interleaving of operations.
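
As an illustration of the synchronization that prevents a race, the sketch below protects a shared counter with a lock. `Counter` and `run_workers` are hypothetical names; without the lock, the read-modify-write in `increment` could interleave between threads and lose updates.

```python
import threading


class Counter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run_workers(counter: Counter, n_threads: int, n_increments: int) -> int:
    """Increment the shared counter from several threads and return the total."""

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```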

Term

How do you detect race conditions?

Definition

To detect race conditions, develop robust test cases that cover many scenarios and concurrent execution paths. Random inputs and stress testing can increase the likelihood of finding a race condition. If weird behavior arises, data corruption occurs, or crashes happen, there might be a race condition.

Perform a thorough review of the code, keeping in mind where shared resources are, like shared variables or data structures. Also look for potential scenarios where missing or incorrect synchronization might lead to a race condition.

Static code analysis tools are made to detect potential race conditions. They look for potential concurrency issues, improper synchronization, or shared resource access violations.

- Debugging and Logging

Concurrency analysis tools detect race conditions in concurrent programs. They use dynamic analysis, runtime monitoring, and state-space exploration to find potential race conditions.

Profiling and Performance Analysis: look for unexpected behavior that may indicate race conditions. Analyze resource utilization, synchronization patterns, and thread/process interactions to identify potential concurrency issues.

Term

How do you debug race conditions?

Definition

Reproduce the Issue: Try to create a reliable way to reproduce the race condition. Identify the specific inputs or conditions that trigger the issue. This can involve manipulating timing, data, or other factors to increase the likelihood of the race condition occurring.
Analyze Error Symptoms: Carefully analyze the symptoms and observe the behavior when the race condition occurs. Look for unexpected output, data corruption, crashes, or any other abnormal behavior. Collect as much information as possible about the observed symptoms to aid in the debugging process.

Logging and Instrumentation: Add logging statements or instrumentation to your code to trace the execution flow and capture relevant information during the race condition occurrence. Log thread IDs, shared resource access, timestamps, or other relevant data to aid in understanding the sequence of events leading to the issue.
Stress Testing: Introduce stress testing by increasing the workload, concurrency, or load on the system. This can help uncover race conditions that occur under high load or specific timing conditions. Stress testing can also expose race conditions that might be more challenging to reproduce in controlled environments.

Binary Search Method: If you suspect a specific section of code is responsible for the race condition, you can use a binary search method to isolate the problematic area. Temporarily disable or comment out parts of the code to narrow down the location of the race condition and identify the root cause step-by-step.

Term

What are three types of design patterns?

Definition

Creational Patterns: patterns regarding making objects without introducing additional complexity

Structural Patterns: patterns regarding how to organize classes and objects to form larger structures and provide new functionality. Keep structures flexible and efficient

Behavioral Patterns: patterns of communication between objects.

Term

What does a project manager do?

Definition

Project Manager: Make sure projects run smoothly, handle scheduling, resource allocation, and scope to meet project goals.

• Product Manager: Own "customer"-facing "product," prioritize features, gather requirements, and define vision and strategy for the product.

• Program Manager: Oversees a portfolio of projects, ensures alignment with business goals, manages dependencies across projects.

Term

What is a project plan?

Definition

- For plan-driven development projects

A project plan examines the available resources and makes a schedule for doing everything.

Term

What are the sections of a project plan?

Definition

Plan sections:
- Introduction
- Project organization
- Risk analysis
- Hardware and software resource requirements
- Work breakdown
- Project schedule
- Monitoring and reporting mechanisms

Term

What is a Gantt chart?

Definition

A Gantt chart is a visual project management tool that represents a timeline of tasks or activities.

Term

What are OKRs and how do they work?

Definition

OKRs (Objectives and Key Results) are a goal-setting framework designed to define and track objectives and their measurable outcomes.

SMART Goals: Specific, Measurable, Achievable, Relevant, Time-Bound.

Quarterly Process:
1. Review the previous quarter’s Key Results (achievements and lessons learned).
2. Define and lock in next quarter’s Objectives and Key Results (cannot be changed).

:::EXAMPLE:::

Objective: Attract more users to new visual similarity recommendations.
Key Results:
• Increase in Customer Lifetime Value (CLV).
• A/B test results (Clicks, Add to Cart, Conversions).
• Year-over-Year (YOY) improvements in these metrics.
• Offline model evaluation and deployment of new model version.

Category: Goal Setting / Performance Management

Term

What is RACI

Definition

(Who's) Responsible / (Who's) Accountable / (Who's) Consulted / (Who's) Informed

Term

What is a priority grid?

Definition

A graph (impact x effort) that is color coded to show what is hard and easy. In other words what to do later and what to do next.

Term

What are types of maintenance?

Definition

Corrective Maintenance:
- Fixing defects and errors discovered in the software.

Adaptive Maintenance:
- Modifying the software to accommodate changes in the environment or external factors.

Perfective Maintenance:
- Improving the software's quality, performance, or maintainability.

Preventive Maintenance:
- Proactively addressing potential issues and preventing future problems.

Term

List some tools for project management

Definition

- RACI
- OKRs
- Gantt
- project plan
- priority grid

Term

What kinds of data collection tools do we have for managing a project's performance?

Definition

- Profiling:
involves collecting data about the execution of the program in order to identify performance bottlenecks, resource usage, and other metrics that can help in optimizing the application.

- Benchmarking:
is a process of evaluating and comparing the performance, efficiency, or other metrics of a software system, hardware component, or algorithm against a set of predefined standards.

- Monitoring:
refers to the continuous observation and measurement of various aspects of the system's performance, behavior, and health in real-time.

Term

What is the ELK stack?

Definition

Elasticsearch:
- Distributed
- Real-time
- Search and analytics engine
- stores, searches, and analyzes large amounts of data quickly.

Logstash:
- data processing pipeline
- ingests, transforms, and enriches log data from various sources.
- log files, message queues, databases

Kibana:
- web-based data visualization and exploration tool
- works with Elasticsearch
- user friendly

Term

What is reliability?

Definition

- The probability of failure-free system operation over a specified time in a given environment for a given purpose

Term

What is availability?

Definition

- The probability that a system, at a point in time, will be operational and able to deliver the requested services

Term

What are the differences between human errors, system faults, system errors, and system failures?

Definition

- Human error or mistake - Human behavior that results in the introduction of faults into a system

- System fault - A characteristic of a software system that can lead to a system error

- System error - An erroneous system state that can lead to system behavior that is unexpected by system users

- System failure - An event that occurs at some point in time when the system does not deliver a service as expected by its users

Term

Name the methods used for fault management

Definition

- Fault avoidance - The system is developed in such a way that human error is avoided and thus system faults are minimized

- Fault detection - Verification and validation techniques are used to increase the probability of detecting and correcting errors before the system goes into service.

- Fault tolerance - the system is designed so that faults in the delivered software do not result in system failure

Term

How do you prove that a formal model is true for all valid inputs to the system?

Definition

- You can use an automated theorem prover (ATP) or a formal verification tool (such as a model checker) to prove that the model holds for all possible inputs, rather than testing inputs one at a time

Term

What is the CI (continuous integration)/CD (continuous delivery/deployment) pipeline?

Definition

Plan —> Code —> Build —> Test —> Release —> Deploy —> Operate —> Monitor —> Plan

Term

What is the difference between CI and CD?

Definition

- CI - automatically builds, tests, and integrates changes within a shared repository
THEN —>
- CD - automatically deploys code changes to customers directly

Term

What are DAG runners and why use them?

Definition

- DAG runners orchestrate the execution of tasks (which are organized in a DAG) based on their dependencies to ensure proper execution.

- They enable automation, scheduling, parallelism, monitoring, and error handling
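
A toy DAG runner, sketched with Python's standard graphlib module (Python 3.9+). The task names and the `run_dag` helper are hypothetical; real DAG runners add scheduling, retries, and parallel execution on top of this ordering step.

```python
from graphlib import TopologicalSorter


def run_dag(dependencies: dict, tasks: dict) -> list:
    """Run each task after all of its dependencies; return the execution order.

    `dependencies` maps a task name to the set of task names it depends on.
    `tasks` maps a task name to a zero-argument callable.
    """
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()  # a real runner would also catch and report errors here
        order.append(name)
    return order
```

For example, with `{"build": set(), "test": {"build"}, "deploy": {"test"}}` the tasks run as build, then test, then deploy. A cycle in the graph raises `graphlib.CycleError` instead of running anything out of order.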

Term

What is docker compose? What are its key features?

Definition

- Docker Compose - tool for defining and managing multi-container applications.

Key features:
- Service definitions: Docker compose uses a compose file to define the services that make up your application. Each service represents a containerized component of the application stack like a web server, database, or worker.

- Orchestration: Docker compose orchestrates the creation, configuration, and management of multiple containers defined in the compose file. Ensures proper coordination and connectivity between the containers.

- Easy configuration and deployment: You can define your application's configuration and dependencies in a single file, making it easy to deploy your application consistently across different environments.

- Network and volume management: simplifies the management of networks and volumes required by your application. Lets you make custom networks for container communication and manage shared volumes for data persistence.

Term

Describe the relationship between threads, cores, and processes

Definition

- A thread = smallest unit of execution. Consists of a thread ID, program counter, register set, and a stack.

- Core = independent processing unit within a CPU that can execute instructions, read and write to memory, and perform I/O operations. Each core can only run one thread at a time.

- Process = an instance of a computer program that is being executed. Contains the program code & current activity.

- Each process runs in its own memory space & requires a context switch to communicate with other processes. A process may contain multiple threads, and each thread belongs to a process. Threads within the same process have shared state

Extra fun fact: A processor can run 1 thread per core simultaneously (a computer can do 1 thing per core), so to do more things at once it just switches really really fast between them

Term

What is the difference between I/O bound processes and CPU bound processes?

Definition

- I/O bound processes - limited by the rate at which data is transferred btwn system and external devices like hard drives, networks, or user input

- CPU bound process - limited by the rate at which the processor can compute

Term

What is the difference between concurrency and parallelism?

Definition

- Concurrency = utilize asynchronous programming or multi-processing to avoid blocking operations, allowing the CPU to perform other tasks during I/O operations.

- Parallelism = use of multi-threading or multi-processing, which allows simultaneous data processing.

Term

What are the methods used to achieve concurrency?

Definition

- Asynchronous programming - program starts an I/O operation, then yields execution. When the I/O operation is complete, execution can be resumed. Allows single threads to handle many concurrent I/O bound tasks.

- Event-driven programming - Driven by events like user actions or messages from other programs. Allows the system to react to I/O events as they occur instead of constantly polling for I/O status or waiting on I/O operations.

- Non-blocking I/O and callbacks - Involves starting an I/O operation then doing other work. When the I/O op is done, a callback function is called to handle the rest. Ensures that the CPU isn’t waiting for I/O operations to complete and can continue doing other work.

- Cooperative multitasking/coroutines: Coroutines are subroutines that allow multiple entry points for suspending and resuming executions at certain locations, enabling cooperative multitasking.

- Actor Model - Treats “actors” as the “universal primitives of concurrent computation” (i.e. the actor is the basic building block of a concurrent program). In response to a message that an actor receives, it can make local decisions, create more actors, send more messages, and determine how to respond to the next message received.

- Data Parallelism - involves distributing subsets of the same data across different cores or threads, and computing on them in parallel.
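
To make the asynchronous-programming and coroutine items above concrete, here is a short sketch using Python's asyncio. `fetch` is a hypothetical stand-in for an I/O-bound call; `asyncio.sleep` plays the role of the network or disk wait.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # Awaiting yields control to the event loop, so other coroutines
    # can make progress while this one waits on "I/O".
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # All three waits overlap on a single thread.
    return await asyncio.gather(
        fetch("a", 0.05),
        fetch("b", 0.05),
        fetch("c", 0.05),
    )
```

Running `asyncio.run(main())` takes roughly as long as one wait, not three, because the single thread is never blocked on any one of them.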

Term

Name some major problems with parallelism

Definition

- Deadlock: Everybody is waiting on everybody else
- Starvation: Someone is waiting for something and never gets it (Not really a problem. The OS handles it)

Term

What makes a good split point?

Definition

Cohesion:
- good split points result in cohesive components with clear purpose
- minimize dependencies on external modules

Loose Coupling:
- minimized dependencies between components for loose coupling
- encapsulate interactions and dependencies
- let components change independently

Testability:
- find split points that allow testing in isolation
- should be able to test on an airplane.

Term

What is Client-Server Architecture?

Definition

request vs response

Client:
- component
- makes requests 
- clients actively initiate transactions

Server:
- component
- fulfills requests 
- servers react to client requests

Term

What is a multi-tier CS?

Definition

A multitier (N-tier) architecture is an expansion of the 3-tier architecture

extra tiers do:
- Replication of the function of a tier
- Specialization of function within a tier
- Portal services, like handling incoming web traffic

Term

what is a merge --squash

Definition

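A merge --squash combines all the changes from one branch into a single new commit on the target branch (the changes are staged by the merge and then committed). The target branch keeps its full history, but the individual commits of the merged branch are not carried over.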

Term

What is a process framework?

Definition

- A set of guidelines, work products, and tools that attempt to facilitate a process

Term

What are the steps of the SDLC (Software Development Life Cycle)?

Definition

- Define, design, develop, deliver, maintain

Term

What is the difference between a prescriptive and agile process?

Definition

- Prescriptive processes - all of the process activities are planned in advance. Progress is measured against this plan.

- Agile processes - planning is incremental & it’s easier to change the process to meet changing requirements

Term

Describe the waterfall model

Definition

- The Waterfall Model:

Reqs definition —> system & software design —> implementation & unit testing —> integration & system testing —> operation & maintenance

- CONS - requires complete upfront specifications and reliable upfront estimates & schedules, encourages over-engineering, late integration & test, limited value for software whose requirements tend to change pretty fast

Term

Describe incremental development and name its pros/cons

Definition

- Incremental Development - Outline Description —> Specification(—> initial version) /Development (—> intermediate versions) /Validation(—> final version)

PROS - much more agile than waterfall. Easier to get customer feedback.
CONS - as development continues system structure gets weaker. Hard to tell where the process is at in development.

Term

Describe the XP release cycle

Definition

- Select user stories for this release —> break down stories to tasks —> plan release —> develop/integrate/test software —> release software —> evaluate system —> repeat

XP Hierarchy:
Theme/Initiative
Epic
User Story
Task
Subtask/Ticket

Term

What is incremental delivery?

Definition

- Incremental Delivery - Deploy an increment for use by end-users —> more realistic evaluation about practical use of software. Difficult to implement for replacement systems (increments have less functionality than the system being replaced)

Define outline requirements —> assign requirements to increments —> design system architecture —> develop system increment —> validate increment —> integrate increment —> validate system —> deploy increment —> (if system incomplete) develop system increment

Term

What is the difference between containerization and virtualization?

Definition

Containerization vs Virtualization:
- Virtual machine = fake hardware | Strong isolation, configurable resource utilization, dedicated hardware resources, live migration
- Containers = fake OS | Lightweight, faster startup & scaling, efficient resource utilization, isolation w/o overhead

Term

What are goals vs requirements vs non goals

Definition

- Goals - what problems is it supposed to solve?
- Requirements - Non-functional (what is it supposed to achieve?), functional (what is it supposed to actually do?), domain (what does it have to do? E.g. compliance with something)
- Non-Goals - What is out of scope for this project?

Term

What's the difference between user requirements, system requirements, and use case?

Definition

- User Requirements - Statements in natural language + diagrams of the services the system provides & its operational constraints. Written for customers
- System Requirements - A structured document setting out detailed descriptions of the system's functions, services, and operational constraints. Basically a contract btwn client & contractor
- Use Case - Describes how a system will behave in broad areas. (Eg. The use case for acc management includes the user changing a password.)

Term

Requirements validation vs Requirements verification?

Definition

- Requirements Validation - Am I building the right product?

- Requirements Verification - Am I building the product right?

Term

Use case vs user story?

Definition

Use Case vs User Story
- Use case = formal description. Contains a lot of information
- User Story = less formal. "As a ___ I want to ___ in order to ___"
- Can be conflicting
- User stories have a benefit. Features are NOT benefits.

Term

Software architecture vs software programming?

Definition

Software Architecture - Interactions among parts, structural properties, system-level performance, outside module boundary

Software Programming - Implementations of parts, computational properties, algorithmic performance, inside module boundary

Term

What is parnas partitioning? How are partitions decided?

Definition

Parnas Partitioning - Principle used to modularize software systems by dividing them into smaller, manageable, and loosely coupled components.
- Modularity - Breaking down the system into discrete modules that can be developed, tested, and maintained independently.
- Information Hiding - Each module hides its internal workings from the other modules, exposing only necessary interfaces.
- High Cohesion, Low Coupling - Each module has high internal cohesion, low coupling w/ other modules

How to Partition:
1. Identify Sys Reqs - Determine critical functionalities & their dependencies.
2. Decompose the system - Break down the system into major functional areas. Use top down, find potential modules
3. Define Module Boundaries - Establish clear boundaries for each module. Each module must have a distinct function
4. Ensure High Cohesion & Low Coupling - Related functionalities are grouped within modules, low inter-module dependence

Term

What is replication / specialization / load balancing? What are they for?

Definition

Replication describes having multiple running instances of a tier. This enhances reliability and availability (by adding redundancy).

Specialization describes making each tier responsible for a single function. This increases modularity

Load balancing is a tier that routes traffic to multiple copies of a tier. This increases availability by ensuring that each server is busy and no server is overwhelmed

Term

What do unit tests test for?

Definition

- Valid inputs and outputs, as well as error handling

Term

Describe what mocks, stubs, and reflectors are.

Definition

- Reflectors are basically just stubs. Both allow us to test incomplete systems by hardcoding answers
- Mocks let you fake calls to outside systems like API calls so you can test completely internally
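
A brief sketch of both ideas using Python's standard unittest.mock. The `greeting_for` function and the `get_user` API are hypothetical, invented only to have something to test.

```python
from unittest.mock import Mock


def greeting_for(user_id: int, api) -> str:
    """Code under test: depends on an outside system through `api`."""
    user = api.get_user(user_id)
    return f"Hello, {user['name']}!"


class StubApi:
    """Stub: a stand-in that returns a hardcoded answer."""

    def get_user(self, user_id: int) -> dict:
        return {"name": "Ada"}


# Mock: fakes the outside call and also records how it was called.
mock_api = Mock()
mock_api.get_user.return_value = {"name": "Grace"}
```

After calling `greeting_for(7, mock_api)`, a test can check the interaction with `mock_api.get_user.assert_called_once_with(7)`, all without touching the network.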

Term

What is the difference between unit, integration, performance, smoke, and regression tests?

Definition

- Unit: does it work in a vacuum?
- Integration: does it work in its env?
- Performance: is it fast/good enough?
- Smoke: is it safe to deploy?
- Regression: is it still working/safe?

Term

What is REST?

Definition

- Representational state transfer. It is a set of principles and guidelines for building web services.

Term

What are the HTTP methods for APIs?

Definition

- GET: to retrieve data from the server
- POST: to submit data to the server to create a new resource, such as when creating a new record in a database
- PUT: to update an existing resource on the server
- DELETE: to delete a resource or collection of resources on the server

Term

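What is the difference between synchronous (blocking) and asynchronous (non-blocking) calls?

Definition

- synch: Wait until you get a result
- asynch: start the call and keep going (e.g. keep rendering the webpage and displaying something during the wait time), then handle the result when it arrives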

Term

What is a callback function?

Definition

- A function that is passed as an argument to another function and is executed by that function once a specific event or condition occurs

- used by asynch programming so that a function can call another function and keep executing, so it can handle events and responses that may not be available immediately.
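
A minimal sketch of a callback, assuming a background thread stands in for the slow operation. `fetch_async` is a hypothetical helper: it returns immediately and calls `on_done` once the result is ready.

```python
import threading


def fetch_async(value: int, on_done) -> threading.Thread:
    """Start the work in the background and return without waiting for it."""

    def worker() -> None:
        result = value * 2  # stands in for a slow I/O operation
        on_done(result)     # the callback runs once the result is available

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

The caller passes any function as `on_done`, for example `results.append`, and keeps executing while the worker runs.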

Term

What are the components of CRUD?

Definition

Data operations usually implement CRUD:

Create
Read
Update
Delete

these operations are atomic

Term

What is atomicity? How is it achieved?

Definition

- Atomicity: A guarantee that once a process has started to update data in a table, it will complete before another process starts to update the same data

- Atomicity is achieved by locking: while one thread is working on the data, other threads are prevented from changing the data mid-stream.

Term

What does ACID stand for?

Definition

- A: Atomicity - all-or-nothing transactions

- C: Consistency - any transaction will result in the database being in a valid state

- I: Isolation - if transactions are done concurrently, the result is the same state that would have been reached had the transactions been done serially

- D: Durability - when a transaction has been committed to the database, it will remain in the database until it is updated by another transaction, even if the power is lost
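
The all-or-nothing part of ACID can be seen in a small sketch with Python's built-in sqlite3 module. The `accounts` table and the `transfer` helper are hypothetical; using the connection as a context manager commits on success and rolls back if an exception is raised.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` between accounts atomically; return False if it fails."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        return True
    except ValueError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical in-memory data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

A failed transfer leaves both balances exactly as they were, even though the first UPDATE had already run inside the transaction.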

Term

What does NoSQL mean?

Definition

NoSQL means anything that isn't an RDB (relational database).

Non-relational databases store all of the data necessary for a record into a single object

This can lead to a lot of duplicate data, but that's not always an issue

Term

What's the difference between a data lake and a data warehouse?

Definition

- Data lake: contains a lot (a lot a lot) of unstructured, barely-processed, raw data. Used for stuff like ML, AI, Streaming analytics etc.

- Data warehouse: smaller, structured, refined. (there was a question on warehouses on midterm 2)

Term

What is ETL?

Definition

- Stands for Extract, Transform, Load
- A process used in data warehousing to collect, process, and move data from multiple sources into a single, unified destination (eg. data warehouse, database, analytical system)

There are 2 ways to do ETL, but you should do both apparently:

1. Batch processing - long-running & scheduled
2. Stream processing - shorter & event-driven

Term

What is CAP theorem?

Definition

You can only choose 2 of the following:

1. (C)onsistency - every read that succeeds returns the most up-to-date data
2. (A)vailability - every request gets a (non-error) response
3. (P)artition tolerance - the system keeps operating even when the network is partitioned (messages between nodes are lost or delayed)

Term

What are some ways consistency is achieved? (in the context of CAP theorem)

Definition

- Conflict Resolution: resolution mechanisms like last-writer-wins (LWW), first-writer-wins (FWW), or custom conflict resolution policies can be used to determine which update should take precedence and ensure consistency

- Distributed Consensus: Provide a way for multiple nodes in a distributed system to agree on a consistent order of operation & ensures that all nodes reach agreement on the order of updates
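
Last-writer-wins can be sketched in a few lines. The `Write` record and the tie-break on `node_id` are assumptions made for illustration; real systems also have to deal with clock skew between nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: float
    node_id: str


def last_writer_wins(a: Write, b: Write) -> Write:
    """Keep the write with the later timestamp; break ties by node id."""
    # The deterministic tie-break means every replica picks the same winner.
    return max(a, b, key=lambda w: (w.timestamp, w.node_id))
```

Because the rule depends only on the two writes, replicas that see them in different orders still converge on the same value.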

Term

When to use server-side vs client-side caching?

Definition

Server side - sacrifices speed for correctness
Client side - sacrifice correctness for speed OR I have v strong assumptions abt how often data updates

Term

What is reverse-proxy caching?

Definition

- cache responses (like HTTP responses) on the disk rather than the data itself
- this removes the load from backend servers altogether
- if a server goes down, you can still process requests until the cache goes stale

Term

Scaling up vs scaling out?

Definition

Scaling up = buy a better box (vertical scaling)
Scaling out = buy more boxes (horizontal scaling)

Term

What is sharding?

Definition

- Each server is responsible for some subset of the data.
- PROS: Eases load on each server. Redundancy/fault tolerance
- CONS: Complexity. CAP theorem.
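
One simple way to decide which server owns a record is to hash its key, sketched below. The `shard_for` helper is hypothetical, and plain modulo hashing is an assumption; it reshuffles most keys when the number of shards changes, which is part of the complexity mentioned above.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in the range [0, num_shards)."""
    # A stable hash is used so every server computes the same answer;
    # Python's built-in hash() for strings varies between processes.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```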

Term

What is cache ejection (also called cache eviction)? What types of cache ejection are there?

Definition

- When your cache is full and you get more data, you need to eject some old data in the cache to make room.

- LRU (least recently used): Use if there are NO trends in what resources are requested, but it's a good bet that if someone just asked for it, they'll ask for it again. Eject the least recently used entry
- LFU (least frequently used): Use if there are trends in what resources are requested
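
A compact sketch of LRU ejection built on Python's OrderedDict. The `LRUCache` class is illustrative and not taken from the course material.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # eject the least recently used entry
```

An LFU cache would instead keep a use count per entry and eject the entry with the smallest count.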
